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Abstract 

We estimate the wave speed in the acoustic wave equation from boundary measurements by con¬ 
structing a reduced-order model (ROM) matching discrete time-domain data. The state-variable rep¬ 
resentation of the ROM can be equivalently viewed as a Galerkin projection onto the Krylov snbspace 
spanned by the snapshots of the time-domain solution. The success of our algorithm hinges on the 
data-driven Gram-Schmidt orthogonalization of the snapshots that suppresses multiple reflections and 
can be viewed as a discrete form of the Marchenko-Gel’fand-Levitan-Krein algorithm. In particular, the 
orthogonalized snapshots are localized functions, the (squared) norms of which are essentially weighted 
averages of the wave speed. The centers of mass of the squared orthogonalized snapshots provide us 
with the grid on which we reconstruct the velocity. This grid is weakly dependent on the wave speed in 
traveltime coordinates, so the grid points may be approximated by the centers of mass of the analogous 
set of squared orthogonalized snapshots generated by a known reference velocity. We present results of 
inversion experiments for one- and two-dimensional synthetic models. 

Keywords. Gel’fand-Levitan, model reduction, optimal grids, Galerkin method, full waveform inversion 
AMS Subject Classifications. 86A22, 35R30, 41A05, 65N21 


1 Introduction 

In seismic reflection tomography, one attempts to utilize measurements of elastic waves to create an (ap¬ 
proximate) image of a region in the earth’s subsurface. In this paper, we present a nonlinear tomographic 
inversion method that can be placed within the so-called full waveform inversion (FWI) framework. Full 
waveform inversion algorithms employ the full equations of motion and utilize as much of the information 
contained in the recorded waveforms as possible to image the material properties of the region of interest 

m- 

The most common numerical approach to FWI is nonlinear optimization, i.e., minimization of the misfit 
between the measured elastic field and the forward model — see, e.g., [Ull^ (and the references within). The 
images created via the optimization approach tend to have high resolution; however, the conventional FWI 
optimization procedure suffers from a few computational and theoretical difficulties. First, the equations and 
models are typically discretized on a fine grid to ensure the synthetic data sets are accurately computed — 
the model parameters tend to be on the order of billions [2T]. Even with the help of adjoint-state methods, 
the solution to 3D FWI problems can take days or weeks of processing time. The second difficulty with the 
optimization problem is that the quadratic misfit functional is nonconvex and has many local minima m- 
Gradient-based algorithms will tend to get stuck in one of these local minima, rather than the true minimum, 
unless the initial model is extremely close to the true model. Several approaches have been developed to 
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mitigate the effects of the nonconvexity of the misfit functional — see mui] and the references therein — 
though they come at a cost. 

Another, direct, nonlinear approach originated from several celebrated works by Marchenko, Krein, 
Gel’fand, and Levitan (MKGL) [351|311 [Ml ISl ISS] . The main idea of this approach is the reduction of 
the inverse problem to a nonlinear integral equation with Volterra (triangular) structure that can be solved 
explicitly. It yields a very powerful tool for inverse hyperbolic problems in ID [331 [Ml jMJ [T^ UIl [25] (and 
the references therein). The main difficulty involved in the application of this layer-stripping-type approach 
in the multidimensional setting is the fact that the scattering data is overdetermined. Recently, progress 
was made in extending the Marchenko and Gel’fand-Levitan approaches to 2D and 3D settings, see, e.g., 
[MIBZ], though more work must be done to improve the lateral resolution of the images in each layer. We 
also point out the related work by Bube and Burridge m, in which the authors solve the ID problem 
by deriving a finite-difference scheme that corresponds exactly to a continuum problem with a piece-wise 
constant coefficient. 

In this paper we apply the discrete MKGL approach (that can be expressed via the Lanczos algorithm 
well known in the linear algebra community) within the reduced-order model (ROM) framework. The ROM 
is obtained by matching discrete time-domain data and its finite-difference interpretation yields a data-driven 
discretization scheme. 

Reduced-order models recently became popular tools for the solution of frequency-domain, diffusion- 
dominated inverse problems, such as diffusive optical tomography, the quasi-stationary Maxwell equations, 
etc. [I1[IS]. The system’s order was reduced by projecting the original system onto a pre-computed or 
dynamically-updated basis of frequency-domain solutions, and then using the projected system as a fast 
proxy in the optimization process. A subspace size sufficient for accurate approximation of the forward 
solver is critical for the success of the method. 

As we shall see, the MKGL approach applied within the ROM not only allows us to obtain images directly 
without optimization, but also to compute sufficiently accurate ROMs with a single Galerkin basis obtained 
for a background (e.g., constant coefficient) model. 


1.1 Reduced-order models and optimal grids 


Our inversion algorithm employs a projection-based ROM. In model order reduction, one replaces a large- 
scale problem with a smaller, more computationally efficient model that retains certain features of the larger 
model — see, e.g., the review article by Antoulas and Sorensen |2] and the book by Antoulas [T] (and the 
references therein). 

We now describe in some detail a particular ROM that is closely related to the model we construct in 
this paper. Consider the following one-dimensional problem for x 6 [0,1]: 

u'{x) - Xu{x) = 0, n'(0) =-1, u(l) = 0, (IT) 

where A e C\] - oo, 0[ is a constant. The impedance function, also known as the Neumann-to-Dirichlet map, 
Poincare-Steklov operator, or Weyl function, is defined by 


/(A)ew(O). 

We wish to construct a small discrete model (a ROM) that accurately computes the impedance function 
/(A) for, say, A 6 [Ai,A 2 ] c [0,oo[. 

To that end, we consider the staggered grid (see Figurein § A.6 in the appendix): 

0 = Xi = Xo < Xi < a:2 < $'2 < ••• < xat-i < xn < 1; 

the stepsizes are hj = Xj+i-Xj and hj = Xj-Xj-i for j = 1,..., A^. A three-term finite-difference approximation 
of (1.11 on this grid is [T3] 


1 
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where Uj « u{xj). This may be written in matrix form as 

AU + AU = -J-ei, 
hi 

where A € U € and ei € contains a 1 in its first component and zeros elsewhere. The discrete 

impedance function is then defined by 


/^(A)e(7i«m(0) = /(A). 

The goal is to choose the stepsizes hj, hj in such a way that /at(A) is an excellent approximation /(A) with 
N small. 

For example, if the grid spacing is uniform and A » 1, U will be a good approximation to u over the 
entire interval [0,1]; in particular, /Ar(A) will be a good approximation to /(A). However, if we are only 
interested in obtaining a good approximation to the solution at x = 0 (i.e., the impedance function), taking 
A » 1 is inefficient. A proper reduced-order model should have /wCA) very close to /(A) for A small. 

As Kac and Krein observed [31], the discrete impedance function /jv may be written as a Stieltjes 
continued fraction [44j with the grid steps hj, hj as coefficients; in particular, 

/nW = - - -■ 

hiX +- 


/12A + ••• +- - - 

hN-i +- — 

hN ^+ 7 — 

hN 

If the grid steps are judiciously chosen, /jv will be a Fade approximant of / and therefore converge to / 
exponentially as A ^ 00 [161 [29l [18] . In other words, Jn will be an excellent approximation to / even if 
A is quite small. These grids are thus known in the literature as optimal grids, and have been successfully 
applied in other related contexts as well [niia. There is also an intimate connection between optimal grids 
and the Galerkin method. In particular, to every A-term Galerkin approximation there corresponds a stable 
three-term finite-difference scheme of no more than A nodes that has the same impedance function m ; we 
will exploit a similar idea when we construct our ROM based on Galerkin projection. Finally, optimal grids 
have been generalized to variable-coefficient Sturm-Liouville problems as well [5]. 

Optimal grids have also been applied to inverse Sturm-Liouville problems [^ . Their usefulness in inverse 
problems stems from the fact that optimal grids are weakly dependent on the variable coefficients of the 
problem. This extraordinary property allows one to use the optimal grids constructed for the constant 
coefficient Sturm-Liouville problem 0) as the grids in an inversion algorithm [2, and has also been used 
in the context of inverse spectral problems [8] and electrical impedance tomography 019]. This idea of the 
weak dependence of optimal grids on the PDF coefficients plays a crucial role in our inversion algorithm as 
well, although we should emphasize that it only holds in traveltime coordinates in the context of the wave 
equation (whereas it holds in physical coordinates in the case of Sturm-Liouville problems). 

1.2 Direct inversion algorithm for FWI in ID 

To fix the idea, let us consider the one-dimensional acoustic wave equation on [0,a;niax] x [0,T]: 

-Uxx{x,t) + ^ uu{x,t) = 0, u{x,0) = b{x), ut{x,0) = 0, 

v^{x) 

subject to appropriate boundary conditions at a: = 0 and x = Xmax- The goal of the forward problem is 
to determine u for t e [0,r] given the wave speed v and the source distribution b (which we assume is a 
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smooth approximation of the delta function). We study the inverse problem of estimating v given the source 
distribution b and 2n equally-spaced samples of the time-domain transfer function 

■^max 2. 1 

f{t)= b{x)u{x,t)^—dx<^^—u{0,t). 

Jo v^[x) 

In other words, we are given b and fk = fikr) for k = 0, ...,2n - 1 and a timestep r > 0 and wish to 
approximate the wave speed v in the interior of the domain [0, Xmax]- We will see that the choice of r plays a 
crucial role in the quality of the inversion results, but we can typically take r to be near the Nyquist-Shannon 
limit of the cutoff frequency of b. The transfer function / is called the single-input/single-output (SISO) 
transfer function in control theory terminology, implying that it was obtained via single-source (input) and 
single-receiver (output) measurements. 

The core of our inversion algorithm is essentially a discrete version of the Krein-Gel’fand-Levitan- 
Marchenko method [Ml El EH eI [MIES]; also see the works by Gopinath and Sondhi [26l US] , Symes [45] , 
Burridge [H], Santosa m, and Habashy [M] for more on the Gel’fand-Levitan method in the continuous 
case. A summary of our application of this method is as follows. We consider the 2n time-domain snapshots 

Uk(x) = u{x, kr) for A: = 0,..., 2n - 1, 
and we define a “matrix” U of the first n snapshots, i.e., 

U = [uo{x),...,Un-i{x)]. 

Because b{x) is an approximation of the delta function, it is localized near a; = 0. Then, due to causality, 
the matrix U will be an approximation of an upper triangular matrix (reminiscent of the “upper triangular” 
kernel from Gel’fand-Levitan theory [M])- We may orthogonalize the snapshots via the Gram-Schmidt 
process and obtain the QR decomposition U = VIZ. Since U is already approximately upper triangular, 
the “matrix” V of the orthogonalized snapshots will be an approximation of the identity matrix, i.e., the 
orthogonalized snapshots are localized. In physical terms, orthogonalization suppresses multiple reflections. 

Unfortunately, we do not have access to the true snapshot matrix U because the wave speed v is unknown 
(so the snapshots are also unknown). However, as we discuss in §[^ in traveltime coordinates the centers of 
mass of the squared orthogonalized snapshots are weakly dependent on the wave speed v. Thus we compute 
the snapshots u^{x) corresponding to a reference velocity which we typically take to be constant. After 
orthogonalization, the centers of mass of the reference squared orthogonalized snapshots approximate the 
centers of mass of the true squared orthogonalized snapshots, and, hence, provide us with a grid for inversion. 
(This is similar to the weak dependence of the grid on the parameters in [^.) 

In our approach, we orthogonalize the snapshots via the Lanczos algorithm without normalization. In 
this case, the (squared) norm of each orthogonalized snapshot contains information about the magnitude 
of V near the center of mass of the squared orthogonalized snapshot; thus the orthogonalized snapshots not 
only provide us with a grid for inversion, but they also provide us with knowledge about the wave speed on 
that grid. 

The crucial feature of our orthogonalization process is that, depending the available data, the computa¬ 
tion of these norms can be performed in two isomorphically equivalent ways. If the velocity, and, hence, the 
snapshots, are known, the norms are computed explicitly in the Lanczos algorithm. On the other hand, if 
only the time-domain data is available, we show that the norms correspond to parameters of a ROM that 
interpolates the discretely sampled time-domain data. In fact, this data-driven, projection-based ROM cor¬ 
responds to the Galerkin method on a (Krylov) subspace spanned by the snapshots and may be constructed 
solely from the discrete time-domain data. The spectral coefficients of the Galerkin approximation satisfy a 
three-term finite-difference recursion that reproduces the data fk exactly, and the coefficients of the finite- 
difference matrix are related to the norms of the orthogonalized snapshots in a simple way. (For more on 
the construction of ROM based on projection onto polynomial and rational Krylov subspaces, see the book 
by Antoulas [T] and the paper by de Villemagne and Skelton [HI; Gallivan, Grimme, and Van Dooren |22j 
and Grimme m discuss the relationship between model order reduction via Krylov projection and rational 
interpolation.) 

We should also discuss the important work of Bube and Burridge m. in which the authors solve the 
ID inversion problem using a finite-difference scheme and Cholesky factorization. Our method also involves 
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a finite-difference scheme and a Cholesky factorization (see Remark 4.6), but the fundamental difference 
between our finite-difference scheme and that of Bube and Burridge is that ours is equivalent to Galerkin 
projection onto the space of orthogonalized snapshots. Indeed, the novelty of the ROM approach discussed 
in this paper is data-driven Galerkin discretization that yields localization of the basis functions. 

In summary, our algorithm may be outlined as follows: 


1. Record the data fk = f{kT) for fc = 0,..., 2n - 1 and t near the Nyquist limit. 

2. Gompute the snapshots u^(x) = vP{x,kT) corresponding to the reference velocity v^(x) (typically we 
take u°(a::) s u(0) for all x € [0,a;max])- 

3. Orthogonalize the snapshots via the Lanczos process (equivalently, the Gram-Schmidt procedure) 
— the grid nodes Xj (in traveltime coordinates) we use for our inversion are given by the centers of 
mass of these squared reference orthogonalized snapshots. 


4. From the recorded data fk, construct the projection-based ROM that interpolates /fc for fc = 0,1,..., 2n- 
1. Use it to compute the norms of the true orthogonalized snapshots. 

5. The estimate of the velocity at the grid point Xj is proportional to a ratio of the norms of the j**' true 
and reference orthogonalized snapshots. 


Since our algorithm is direct, it avoids the difficulties associated with iterative gradient-based algorithms 
that we described earlier. In particular, our algorithm cannot become trapped in a local minimum. Addi¬ 
tionally, we only need to solve a single forward problem (to compute the reference snapshots in step 2), and 
the reference velocity for this forward problem is typically very simple (e.g., constant). Finally, one may 
use our algorithm as a direct imaging algorithm (as we do in this paper), or as a nonlinear preconditioner 
(similar to that in |10j l which generates a reasonable initial model close to the true model m that can 
be used in least-squares optimization. 

The remainder of our paper is organized as follows. In § [^ we define the problem. We discuss the 
orthogonalization of the snapshots in § [^ Gonstruction of our data-driven, interpolatory ROM, based on 
Galerkin projection onto the Krylov subspace spanned by the snapshots, is discussed in § |^ We develop 
our inversion algorithm in § [^ and demonstrate it via several numerical experiments in § [^ We describe a 
two-dimensional extension of our algorithm in § [^ Detailed proofs of many of the lemmas are given in the 
appendix. 


2 Problem formulation 

We start with the Cauchy problem for the Green’s function for the one-dimensional wave equation on 

[0, Xmax] ^ 


Ag + gtt = 0, gx\x=o = 0, = 0, g\t=o = S{x + 0), 5 t|t=o = 0, 


( 2 . 1 ) 


where 




dx^ 


with the Neumann-Dirichlet boundary conditions from (2.1), and the wave speed v{x) is a regular enough, 
positive function on [0,a:niax]- Here and throughout the paper, (5(a; + 0) denotes the “right-half” Dirac delta 
function and satisfies the normalization 


oo 

5{x + 0) dx 

0 


1 . 


We study the inverse problem of determining v{x) from the boundary data g\x=o- For regular enough 
boundary data and for all x e [OjCCmax] there is a unique mapping associating the data g{0,t) to the velocity 
v(x) where te [0,2af(a:)] and the slowness (traveltime) coordinate transformation is 



1 

v(x') 


dx'; 


( 2 . 2 ) 





Inversion via projection-based model reduction 


6 


see, e.g., [HEl[Ml[31123 [21 SI III]• 

The Cauchy problem (2.1) can be equivalently rewritten on [0, a;max]x] - oo[ as 
Ag + gtt = 5{x + Q)6{t)t, g^\^^Q = 0, = 0, g\t<o = 0. 

We introduce the weighted inner product ((•,•)) on T2[0, Xmaxli defined by 

1 


{u,w)) = 


f 

Jo 


u{x)w{x) 


v^(x) 


dx. 


(2.3) 


(2.4) 


We note A is self adjoint and positive definite with respect to ((•,•)); real functions of A (continuous on the 
spectrum of A) are self adjoint with respect to this weighted inner product as well. 


The solution of (2.11 can be formally written via an operator function as 

g{x,t) = cos ^t\/A^ 6{x + 0) = J cos{t\/~^ p{x,\) d\, 


where 


p{x,\) = f^5{X-Xi)^^^zi{x) 


1=1 


v{oy 


(2.5) 


( 2 . 6 ) 


is the vector spectral measure associated with A and {Xi,zi{x)) are eigenpairs of A. Here we have used the 
fact that, because A is self-adjoint with respect to the inner product ((•,•)}, the eigenfunctions can be chosen 
to be orthonormal, i.e., {{zi,Zk)) = Sik where Sik is the Kronecker delta. Note also that the eigenfunctions zi 
must satisfy the homogeneous Neumann-Dirichlet boundary conditions associated with A, namely zi j.(0) = 0 
and .2^/(Xniax) “ 0- 

We use the Green’s function from (2.1) to study a problem with a variable source wavelet g{t)t (in place 
of S{t)t in (2.3)). We assume q € Li] - oo,oo[ is an even, sufficiently smooth approximation of 6{t) with 
nonnegative Fourier transform 


g E .F(g)(s) = J 2cos{ts)q{t) dt. (2.7) 

To fix the idea, we use the Gaussian 

for some cr > 0; in this case, 

g(s^) = exp|-^^j. (2.9) 

We should choose a to be small so that ( |2.8[ ) gives a good approximation to S{t). Physically, a gives a 
measure of the duration of the source wavelet q{t)t in time, and, as can be seen from ( |2.9[ ), cr is inversely 
related to the bandwidth of this wavelelQ As we will see (most prominently in § [^, the time-domain 
measurement sampling rate is closely related to cr. 

This choice of g yields the equation 

Ay+mt = ^(a; + 0)g(t)t, 'gx\x=o = 0, '9\x=x^,^^ = 0, lim g = 0 

t-* — oo 


on [0,a;i„ax]x] - oo,oo[. The solution to this equation can be written via a convolution integral as 

/ oo 

H{t-t')g{x,t-t')q{t')dt', (2.10) 

oo 

where the Green’s function g solves (13 and H is the Heaviside step function. 

* Strictly speaking, the Gaussian pulse has an infinite bandwidth; however, for all practical purposes, the decay of ^is rapid 
enough that we may speak of an “effective bandwidth,” namely values of beyond which q(s^) is sufficiently small. 
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Let u(x,t) = 'g{x,t) +'g{x,-t). Then, using 'g = T ^ [T{Hg)T{q)^ (which follows from (2.10) and the 
convolution theorem for Fourier transforms) and (2.7), we obtain 

u{x,t) = — / cos{ts)iR[lF{Hg)lF{q)] ds = — / cos{ts)iyi[lF{Hg)]q(s^) ds. (2-11) 

TT Jo TT Jo 


For q = (5(t), from (|2.3[), (|2.5[), and (2.10) we have 


^ oo 

I{x,t) = g{x,t) = 2 J cos{ts)p(^x,s^) sds 


for t>0. Comparing this with (2.11) (and taking q{s^) = 1), we find iR[IF{Hg)] = 7rp(a;,s^)s. Combining 
this with (2.11), for general q we have 


^ OO 

u{x,t) = 2 J cos(ts)p(a;,s^) s5'(s^) ds 
^t\/A j p{x, A)q(A) dX 
= cos ^tVA^ q{A)6{x + 0). 

This implies u solves the following Cauchy problem on [0,a;niax] x [0,oo[: 


( 2 . 12 ) 


Au + utt = 0, u2,|3,=o = 0, = 0, M|t=o = ?(^)'5(a; +0), ut|t=o = 0. (2-13) 

Our measurements are defined for t e [0,T] by f{t) = u(0,t). In practice, we only take measurements at 
the discrete times /cr for fc = 0,..., 2n - 1, where (2n - l)r = T and r is the sampling timestep. We choose 
a time discretization step r > 0 consistent with the Nyquist-Shannon sampling of the cutoff frequency of q, 
i.e., we take t a. Our goal is to solve the following problem. 

Problem 2.1. Estimate 'c|[o,^-i(t)] from = u{0,kT), fc = 0,..., 2n - 1, provided x~^(T) < Xmax- 
We will see that the choice of r influences the quality of the inversion results. 


3 Continuum interpretation 


The solution {2A2\ at the discrete times kr is 


u{x, kr) = cos ^krVA^ q{A)5{x + 0) 

= cos ^fcarccoscos q{A)S{x + 0) 

= Tfc ^cos (^T\^A^'jq{A)S{x + 0), 


(3.1) 


where Tk is the fc**' Chebyshev polynomial of the first kind. 

We define the propagation operator P = cos(T^/]4). Then, from the spectral representation (2.12), we 
can equivalently rewrite (3.1) as 


7{x, kr) = Tk{P)q{A)S{x + ^) = f ^ Tk{p)r]{x, p) dp, 


(3.2) 


where 


OO / 

r]{x,p) = 2 Y, sgn(j)g| 

j = -oo \ 


(arccos(/x) + 2j7r)^ \ arccos(/x) + 2jTr 


p\x 


(arccos(^) + 2j7r)^ \ 1 




(3.3) 
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and we take sgn(O) = 1; the infinite summation is due to the multiplicity of arccos (see § A.l in the appendix 
for a derivation of (3.2)-(3.3)). Then the data are given by 


fk = j^ Tk{p)r]o{p)dp, 


(3.4) 


where poip) = ?y(0,/r). 
We define 


c= fo = r]o{fJ,)dp. 


(3.5) 


If we assume q'is positive (this assumption holds for the Gaussian source q{t) in (2.8)), then (3.3) and (3.5) 
imply c"^77o(/r) dp is a probability measure. We also conjecture that f c“^? 7 o(/r) dp has at least n points of 
increase on [-1,1]. This can be proven if the wavespeed v is regular enough, but for the sake of brevity and 
clarity we provide a qualitative explanation in § E3 


Definition 3.1. Suppose q{A) is positive definite (this is true for the Gaussian source in (2.8), for example). 
Let u{x, t) he the solution to the following Cauchy problem on [0, Xmax] x [0,oo[; 


Au + utt = 0, 


^x\x=0 ' 


= 0, M|t=o = b, Ut\t=o = 0, 


(3.6) 


where 


h{x) = v{Q)q{AYl^5{x+Q). (3.7) 

(This equation is equivalent to (2.13) except for the initial condition — in fact, u(x, t) = u(0)“^g'(A)^'^^u(x, t).) 


Then, for k = 0,... ,2n - I, the snapshots are defined by 

Uk{x) = u{x, kr) = cos (^kr-s/A^ b{x) = Tk ^cos (TVA))b{x) = Tk{P)b{x). 


(3.8) 


From the definition of the snapshots and the fact that functions of A (such as q{Ay/^) are self adjoint 
with respect to the inner product ((•,•)), the data satisfy 


Recall that 


fk = {{uo,Uk)) = {{b,Tk{P)b)) for A: = 0,... ,2n - 1. 


U = [uo(a^),ui(a::),...,M„_i(a:)]. 


(3.9) 

(3.10) 


Sometimes for shorthand and for w,u e L2[0,a:niax] we will write w*u = {{w,u)), so by referring to (3.10) 


as a matrix we imply the corresponding multiplication rules. In particular, multiplication from the left by 
another matrix W = [wo(a^),... ,Wn-i{x)] of the same form is defined as 


W*U = 


{{wo,Uo)) 

{{wi,uo)) 


{{wo,Ui)) 


({wo,Un-l)) 

{{wi,Un-l)) 


_{{Wn-l,Uo)) {{Wn-l,Ui)) ... ((w„_i, U„_i))_ 


(3.11) 


If our assumption that c ^rjoip) dp has at least n points of increase is satisfied, then rank U = n and 
Range U is the Krylov subspace 


ICn{uo,P) = span{ito,Puo,... ,P” ; 

see, e.g., § 3.2.1 of the book by Liesen and references therein. In particular, U*U is symmetric and 
positive definite since U is of full rank. 

In the remainder of this section, we derive an algorithm for orthogonalizing the snapshots. As we will see, 
the orthogonalized snapshots are localized in some sense, so they provide the key to our inversion algorithm. 
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3.1 First-order finite-difference Galerkin formnlation 


Because the snapshots can be written in terms of Chebyshev polynomials as in (3.8 1 and the Chebyshev 
polynomials satisfy a three-term recurrence relation, the snapshots satisfy the following second-order time¬ 
stepping Cauchy problem in operator form: 


'^k+l + Uk-i 


= ^{P)uk, Uq = 6 , U-I = Ml, 


where 




(3.12) 

(3.13) 


From a Taylor expansion (for regular enough u), we obtain 

2 


^{P)u = —cos M = -Au + O (||(rA)^M||). 


i.e., (3.121 can be viewed as an explicit time discretization of ( |3.6[ ) that reproduces the snapshots exactly. 

We now state several useful lemmas; the proofs which are not given here are contained in the appendix. 
In the first lemma, we transform (3.61 to slowness coordinates. 


Lemma 3.2. Suppose u solves ( |3.6| ), and let 

U{x,t) = U{x{x),t), V{x) = V(x{x)), and X^ax = x(Xma.x), 


where the (invertible) slowness coordinate transformation x{x) is defined in (2.2). Then u is the solution of 
the following Cauchy problem on [0,afmax] x [0,oo[.- 


where 


'^x\x=C ~ 0; '^\x=x^s,x ~ '^|t=0 “ ^t|i=0 — O5 

d (\du\ 


Au + utt = 0, 

b{x) = q(A) ^ S(x + 0) and Au = -v-—{-—\ 

ox \v ox) 


(3.14) 


with the Neumann-Dirichlet boundary conditions in (3.14). The operator A is self adjoint and positive 
definite with respect to the inner product where 


{u,w) 


Ijv 


r 


:_1 _ 

u{x)w{x) dx. 
v{x) 


We now define a dual variable, w, that will be useful in the remainder of the paper. 

Definition 3.3. We define the dual variable, denoted by w, as the solution of the following Cauchy problem 
on [0,Xmax] X [0, oo[.- 


Cw + wtt = 0, 


where 


1 db 

w\x=o = 0, w^\x=x^^^ = 0, ■u?|t=o = 0, M;t|t=o = - 7 ^ 3 , 

V ox 


1 a /.affii 


(3.15) 


with the Dirichlet-Neumann boundary conditions in ( 3.15[ ). The operator C is self adjoint and positive 
definite with respect to the inner product where 

G ^max 

J u(x)w{x)v{x) dx. 


The Cauchy problems (3.14) and (3.15) can be rewritten in first-order form as in the following lemma. 


Tn physical coordinates, the operator C is given by Cw = with the boundary conditions = 0 and 


'^x\x=x 
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Lemma 3.4. Suppose u and w are the solutions to the following Cauchy problem on [0, Xmax] x [0,oo[; 


'Wx= -Ut, Ux=VWt, 
V 


= 0, w\x=o = 0, u\t=o = b, w\t=o = 0. 


(3.16) 


Then u solves (3.14) and w solves (3.15). 


The next definition is an extension of Definition 13.1 


Definition 3.5. Let u and w be the solutions to (3.16) (so u is the solution to ( |3.14 1 and IS is the solution 
to ( 3.15| ) ). Then, for fc = 0,..., 2n - 1, the primary snapshots are Uk = u{x, kr), and the dual snapshots are 
iSk = w{x,{k + 1I2 )t) . 


Note that the primary snapshots, Uk, are simply the snapshots from Definition |3.1[ namely Uk, trans¬ 
formed into slowness coordinates; i.e., Uk{x) = Uk{x{x)). 

In the next lemma, we give expressions and finite-difference recursions for the primary and dual snapshots. 

Lemma 3.6. Suppose u, w are the solutions to ( 3.16[ ). Then, for fc = 0,..., 2n - 1, the primary snapshots 
are given by 

Uk{x) = Tk (P)uo(x), 

T\/A\ and uo{x) = b(x) = 6{x + 0). This implies the primary snapshots satisfy the 


reeursion 


Uk+i - 2ufc + Uk-i 


= (,{P)uk for k = 0,...,2n-2, uq = b, ui = U-i, 


where ^ is defined in (3.13). 


Similarly, for k = 0,... ,2n - 1, the dual snapshots are given by 

uyfc(3f) = [Tf {Pc) + Ti^\{Pc)]wo, 


(3.17) 


(3.18) 


where Pc = cos I T\/C y T^'^ is the k*^ Chebyshev polynomial of the seeond kind (withT^ = 0 andT^ = -l). 


(3.19) 


and Wo = w(x,0.5t). This implies the dual snapshots satisfy the recursion 

Wk+i - 2wk + Wk-i 


= f{Pc)wk for k = 0,...,2n-2, 


Wo + W-i = 0, Wo = sin 


(0, 


V OX 


In the following lemma, we rewrite the recursions from Lemma 3.6 in first-order form. 


Lemma 3.7. The second-order time-stepping schemes (3.17) and (3.19) can be equivalently rewritten as the 
first-order “leapfrog” discretization of (3.16). In particular, 

for fc = 0,..., 2n - 1, 


Wk - Wk-l 1 X ~ 

~ _ -^T^k 

r V 


'^k+l '^k 


= -vL^Wk for fc = 0,..., 2n - 2, 


(3.20) 


uo = b, 1^0 + W-i = 0; 

here Lx"'" is the adjoint of Lx with respect to the standard inner product on L 2 [ 0 ,afmax], 

Lx =‘^ ■ smid.bT\/~^ , and ^ sinfo.Sr'yi) 

T ox \ / T v \ / 

In particular. Lemma 3.7 implies the operators ^ (P) and ^ (Pc) may be factored as 


OX 


f{p\=-vLx'^-Lx and £, (Pc) =--LxvLx"'"■ 
v V 


(3.21) 

The upshot of this section is that the snapshots in Definition |3.5| may be generated via finite-difference 
schemes — the second-order finite-difference schemes are given in Lemma |3.6| while the equivalent first-order 
finite-difference scheme is given in Lemma |3.7[ This theme permeates the remainder of this section — as we 
will see, all of our first-order algorithms and recursions have second-order equivalents. 
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3.2 Orthogonalization of the snapshots 


It turns out the orthogonalized snapshots are localized (we will justify this in later sections), so they are 
useful as a basis for an inversion method. In particular, the (squared) norm of each orthogonalized snapshot 
contains information about the magnitude of the velocity near the point about which that orthogonalized 
snapshot is localized (specifically, the center of mass of the corresponding squared orthogonalized snapshot). 
We discuss our inversion algorithm in more detail in §[^ for now, we focus on orthogonalizing the snapshots. 

Lemma 3.6 implies the first n primary and dual snapshots span the Krylov subspaces 


/C“ (mo, P) = span {mq, Pmo, 


,P”-'So} 


and 

{wo,Pc) = span{wo,Pcwo,.. .,Pg~^Wo} , 

respectively. The classical method for constructing an orthonormal basis of a Krylov subspace is the Lanczos 
algorithm [40], and the algorithm we use is a first-order equivalent of the Lanczos algorithm. We begin by 
dehning some useful operators. 


Definition 3.8. We define the operator C by 


C = 




Then the time-stepping scheme (3.20) can he written as 


c 

Uk 

= dr 

Uk 


Wk 


Wk_ 


for A: = 0,..., 2n - 1, 


(3.22) 


where 



1 


T 


Wk - Wk-l 


(3.23) 


(Technically speaking, U 2 n not defined — we may 
the inner product by 


define it through (3.22) for completeness.) 


We define 



Ifv^v 






The operator C is anti-self-adjoint with respect to the inner product i.e.. 



Ijv^v 


Next, we project the operator C onto the Krylov subspaces spanned by the snapshots, namely /C“ (uo,P) 
and KTf {wq,Pc)- Before presenting the algorithm, we introduce some notation. 

We denote the orthogonalized primary and dual snapshots by Uj and Wj, respectively, for j = l,...,n. 
(Note that we have shifted the index by 1 — the snapshots Uk and Wk are indexed from fc = 0 to fc = n- l.) 
We store the orthogonalized snapshots in “vectors” of the form 


U2j-1 



and 


C/2,= 



for j = l,...,n. 


or, even more compactly, in a “matrix” 


Q=[Ur,...,U2n] 


Ml 0 M2 0 

0 Mil 0 mJ2 


0 

W„ 


(3.24) 


(3.25) 
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The Lanczos algorithm constructs a tridiagonal matrix 7~ € ]R2 "x 2" such that 


- QT H-C^2n+ie^„, 

In 


(3.26) 


where 7 „ is a constant we define later and U 2 n+i appears because, in general, Lanczos tridiagonalization is 
run on a family of snapshots that may be infinite (or with dimension at least 2n+ 1 — see, e.g., [40] 1: C/ 2 n+i 
will not be needed for the remainder of the paper. Since C is anti-self-adjoint and the columns of Q are to 
be orthogonal, the diagonal components of 'T must be 0. To obtain the desired orthogonality properties, we 


take 




T=OT 

-1 

(3.27) 

where 


'0 -1 






0 = 

1 0 

-1 

1 0 


r E diag (7i,7i,72,72,---,7n,7n), 

(3.28) 

and, for j = 1,. 

..,n, 

I3 = 


= {uj,Uj)y^ and 

13= \\Wj\\~^ = ■ 

(3.29) 


Then (3.241-( 3.29) give the first-order algorithm for the orthogonalization of the first n primary and dual 
snapshots, which is summarized in Algorithm |3.1[ 


Algorithm 3.1 Orthogonalization of Snapshots 

Input: u{x,0) = 6 (x), u, Xmax, n, Lt, and Lt^ 

Output: 7 j, 7 j, and orthogonalized snapshots uj, Wj for j = 1,... ,n 
Set wq = 0 and ui = b. 

for 7 = 1 ,..., n do 



2 . Wj = Wj-i +jj-LrUj; 


II II 2 /-lEmax ’ 

f {wjYvdx 

Jo 

_ _ _ 'JP 

4. Uj + I = Uj - "fjVLr Wj. 

end for 


We pause to consider a couple of important features of Algorithm |3.1[ First, note that the recursion steps 
(steps 2 and 4) resemble a finite-difference algorithm that exactly computes the orthogonalized snapshots, 
since _ _ 

7/ , -1 — 7/ , rr, 

and 


Uj+l ~ Uj _ rp _ 

- - = -vLr W, 


V 


if Uj and Wj are localized in some sense (as we claimed above), then, due to steps 1 and 3, 7 j 


Ij l3 

Second 

and 7 j are related to localized averages of the velocity (roughly speaking). This is a key insight for our 
reconstruction algorithm — 7 j and 77 give us estimates of pointwise values of v near where the squared 
orthogonalized snapshots are localized, i.e., on the optimal grid defined by the centers of mass of the squared 
orthogonalized snapshots. Admittedly, this explanation is not complete; we will add more details in later 
sections. Third, in Algorithm 3.1 we assume v (hence v) is known; in § 4.3 we compute 7 ^, 7 ^ from the 
measured data without any a priori knowledge of v. Finally, the following proposition summarizes the 
important properties of Algorithm] 3. 1 | 
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Proposition 3.9. Suppose Uj, wj (j = are obtained via Algorithm 3.1. Then (ui,Uj).^i~ = Ijdij 

and {wi,Wj)~ = for i, j = 1,... ,n. Moreover, 

spanjui,... ,u„} =/C“ (uo,P) and span{wi,... ,Wn} = {wq, Pc) . 


The next two lemmas show that the first-order algorithm in Algorithm |3.1| is equivalent to the Lanczos 
algorithm. 


_1 /<2 

Lemma 3.10. Suppose the functions Uj (j = 1,... ,n) are constructed via Algorithm 3.1 Then Uj = 7 ^- ' dj, 
where the functions dj are obtained from the following Lanczos algorithm: 

Input: ui = u{x,0) = b{x), v, Xmax; n, and f (P) 

Output: 7 j and normalized, orthogonalized primary snapshots dj for j = 1,... ,n 

Set do = 0 and di = 771 - 7 — • 
hi 1 1/77 

for j =l,...,n do 

1. ai; = {d,,f{P)d,)^^^; 

2. r=[f{P)-a-l]d,-bl,d,.r; 

= \l(p 

4- = 


end for 


Moreover, the Lanczos coefficients o“, 5“ from the above algorithm are related to'^j, 7 ^- from Algorithm 3.1 
by 

for j = l,...,n, 


Ij \Tj- 
b^- 1 


(3.30) 


for j = l,...,n-l, 


where we have taken 70 e 00 . 

Lemma 3.11. Suppose the fv 
where the functions Qj are obtained from the following Lanczos algorithm: 

Input: wi = 7 i —LrWi (from Algorithm 


— 1 /2 

Lemma 3 . 11 . Suppose the functions Wj (j = 1,... ,n) are constructed via Algorithm 3.1 Thenwj=^j ' gj, 


3.1), v, Xmax, n, and f [Pc] 


Output: 7 j and normalized, orthogonalized dual snapshots Qj for j = 1,... ,n 


Set £<o = 0 and Qi = 7 — 


Wi 


a 


for j =l,...,n do 

1. af = {gj,i{Pc)gj),^; 

2. r=[f{Pc)-a'fl]e,-hJ_,B,.,; 
3- bj = 


4- Qj+i 


end for 

Moreover, the Lanczos coefficients aj, bj from the above algorithm are related to'^j, 7 ^ from Algorithm 
by 


3.1 
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Recall that, before orthogonalization, the primary and dual snapshots can be represented in terms of 
Chebyshev polynomials of the operators P and Pc, respectively (see Lemma 3.6). The next lemma and 
the remark following it give representations of the orthogonalized primary and dual sna pshots in terms of 

however, is 


3.12 


polynomials of the operators ^{P) and f (Tc)i respectively. The true value of Lemma 
that it provides a proper normalization for the derivation of explicit formulas for the continued fraction 
coefficients (i.e., and jj) in terms of the data in both the scalar (ID) and matrix (2D and higher) cases. 
We relegate the proofs to the appendix. 


Lemma 3.12. Suppose the orthogonalized snapshots Uj and Wj (j = 1,... ,n) are obtained via Algorithm 
Then ^ 

Uj = (^)) wi, where q^{0) = 1 

and q'j is a polynomial of degree j - 1; similarly, 


3.1 


'^3 = if {^{Pc))wi, where qf{0)= — ^% 

7i k=i 


and qf is a polynomial of degree j - I. 


Remark 3.13. Using the fact that, in spatial coordinates x, di{x) = '^f^b{x), one can show 
where is the set of orthonormal polynomials generated by Algorithm ^4-l\ (below) with the inner product 

1 

{P,q)y,^(e) = - 
c 




in place of the inner product (•, ■)y g. 

4 Transformation of the time-domain data to an equivalent finite- 
difference reduced-order model 


Our goal in this section is to construct a finite-difference scheme involving a data-driven reduced-order 
model for the propagator P = cos(tV 3^) that reproduces the data (3.4) exactly. The coefficients of this 
finite-difference scheme (which is also our ROM) are essentially localized averages of the velocity. Thus 
the construction of the ROM is the core of our inversion method, since it transforms the time-domain data 
(which is all we have) into a “more usable” form. 


4.1 Chebyshev moment problem in Galerkin—Ritz formulation 

We solve the data-interpolation problem by constructing a Gaussian quadrature rule with nodes 9j and 
weights y'j for the weight rjo (defined in (|3.4|)); that is, we find spectral nodes 9j and weights y'j such that 


X l n 

Tk{p)vo{u) dp = = fk ioT k = 0,...,2n- 

^ 3 = 1 


1 . 


(4.1) 


This is the classical quadrature problem (in the Chebyshev basis), and the existence and uniqueness of its 
solution are given by the following well-known result — see, e.g.. Theorems 1.7, 1.19 (which can be extended 
to discrete measures), and 1.46 in the book by Gautschi [23]. 

Lemma 4.1. Let c~^riQ{p) dp be a (probability) measure such that ff^c~^rio{p) dp has at least n points 
of increase on [-1,1] (collectively, such points are also known as the support or spectrum of the measure 


c ^r]o{p) dp). Then (4.1) has a unique solution with positive yj and noncoinciding 9j e (-1,1). 


As we discussed in § 
hypothesis of Lemma |4.1 


3| for regular enough wavespeed v it can be shown that c ^rio{p)dp satisfies the 
— see § |A.2| in the appendix. 
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There are numerous algorithms for the quadrature problem (4.11 (see, e.g., the end of § 1.4.1 and Chapter 
3 in [23)1: however, for the sake of the continuum interpretation of our approach we give an algorithm based 
on the Galerkin projection method onto Krylov subspaces. The proofs of the remaining lemmas in this 
section are given in the appendix. 

The following lemma gives the Galerkin representation of Uk and fk in the Krylov subspace ICniuo, P). 


(4.2) 

(4.3) 

(4.4) 


Lemma 4.2. If tjq satisfies the hypothesis of Lemma \4.1[ then 

Mfc = C/Tfe(H)ei for k = 0,...,n-l, 

and 

fk = el (t7*C/)Tfe(H)ei /or A: = 0,..., 2n - 1, 

where 

H E {U*Uy^{U*PU) € 

We give the spectral decomposition of the matrix H in the next lemma. 


Lemma 4.3. Suppose tjq satisfies the hypothesis of Lemma f.l and H is defined as in (4.4). Then H is self 
adjoint with respect to the inner product defined by 

(x, E [{U*uy/^xf [{U*uy/^z] = x^(t/*C/)z for x. 

The spectral decomposition of H can be written as 


z e. 


H= 


(4.5) 


where & is a diagonal matrix of the eigenvalues o/H and $ is the U*U-orthonormal eigenvector matrix, 


i.e., ^^U*U^ = l. 


Substituting (4.5) into (4.3) we obtain 


fk = X^Tk{®)x for A: = 0 ,... ,2 n-1 , where x = ^^U*Uei. 


Comparing (4.6) and (4.1), we derive 

diag 6., = ® and {yi, ■ ■ ■ ,yn)'^ = X- 


(4.6) 


(4.7) 


In other words, once we know © and x "we may compute the nodes 9j and weights 2 /| for the Gaussian 
quadrature (4.1). 


The matrices U*U and U*PU (and, hence, H via (4.4)) can be computed in terms of the data via the 
following lemma (the proof is given in ^ A.14| of the appendix). 

Lemma 4.4. We use the notation T1 (first column, first row) for Toeplitz matrices and Hffirst column, last 
row) for Hankel matrices. Then if we set 

T°ET([/o,/l,/2,...,/n-l],[/o,/l,/2,...,/n-l]), 

T^ET([/i,/2,/3,...,/„],[/l,/o,/l,...,/n-2]), 

T'eT([/i,/o,/i,...,/„_2],[/i,/2,/3,...,/„]), 

H° E H( [/o, /l , / 2 , /„-l] , / 2 „- 2 ] ) , 

H" E H([/i, /2, /3, . . . , fnl [fn, fn.l, fn. 2 , ■■■, fzu-l]), 

H = H([/i, /o, /l, - . . , fn-2], [fn- 2 , /ra-1, /n, • • ■ , /2n-3]), 

we get the expressions 


U*PU =^{T^ + T~ + W + U~) and 17*t/= ^ (T° + H'*) . 


(4.8) 


In summary, formulas ( |4.4[ )-( [4~^ provide the algorithm for computing yj and 9j from the data for 


Finally, substituting (4.5) into (4.2) we obtain 

Uk = ZTk{@)x for A: = 0, ...,n-l. 


(4.9) 


where Z =U^. By construction, Zsj and 9j are the Ritz pairs of P on the Krylov subspace lCn{uo, P). 
















Inversion via projection-based model reduction 


16 


4.2 Finite-difference recursion 

Let us find a symmetric, tridiagonal matrix 


= 


ai fli 

Pi 012 


Pn-1 


Pn-l 


= € 


such that 


b^Tfc(P„)b„ = £ y]Tk{e,) = fk for /c = 0,..., 2n - 1, 


j=i 


where c is defined in (3.5) and b„ e ^/cel. Taking fc = 0 in ( |4.11 ) gives 

n 

'Eyj = fo= / mid) dp. 

J-1 


C = 


(4.10) 


(4.11) 


(4.12) 


1=1 


The expression on the left in (4.11) is the ROM for the data as expressed in (3.9). We will see that P„ and 
b„ are the projections (up to scaling for b„ of the propagator P = cos(r\/]4) and the source/measurement 
distribution 5, respectively, onto the space spanned by the (orthogonalized) snapshots, namely /C„(uo,P); 
i.e., P„ is our ROM of P and b„ is our ROM of b. 

we constructed a Gaussian quadrature with respect to the weight m/o with nodes 0j e [-1,1] 


In § 


4.1 


and positive weights yjjc such that, for sufficiently smooth functions g, 

Void) 


f 9id)'^dp-, 

7 = 1 -'-I C 


(4.13) 


this Gaussian quadrature rule is exact when g is a polynomial of degree less than or equal to 2n - 1. It is 
well known that the eigenvalues and the squared first components of the (properly scaled) eigenvectors of a 
symmetric, tridiagonal matrix P„ with positive off-diagonal entries — a Jacobi matrix — are the nodes and 
weights, respectively, of a Gaussian quadrature [251 Ej. Thus our task is to construct the Jacobi matrix P„ 
with eigendecomposition 

P„X=©X, (4.14) 

where the eigenvalues of P„ are 6j and the eigenvectors X^ satisfy Xj’X^ = 5ik (where 6ik is the Kronecker 
delta symbol) and 

(efX,)2 = j;f/c. (4.15) 

The entries of the Jacobi matrix P„ are the coefficients of the three-term recurrence relation satisfied by 
the set of polynomials Vn = {qo,qi,..., qn-i}, where qk is a polynomial of degree less than or equal to k and 
the polynomials in Vn are orthonormal with respect to the weight mid)lo, i.e., 

= f ^ qtid)qkid)^^^^^ dp = S^k- 

Moreover, the Gaussian quadrature ( |4.13| ) computes the inner product with weight m/o between any two 
polynomials in this orthonormal set exactly (since qiqk is a polynomial of degree i + k < 2n- 2), so 


1 " 

= - E yh^i^3')dkidj) = S^k■ 

^ 1=1 

The Jacobi matrix P„ may be constructed via the Lanczos algorithm in Algorithm |4.1| (below), which 
is equivalent to running the three-term recurrence relation for the set of orthonormal polynomials Vn- The 
appropriate inner product is given by the normalized spectral measure 

1 " 

{P,q)y,e = -'£y^pi0j)q{9j), 

c 
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which is simply the Gaussian quadrature (4.13) applied to (which is exact for the polynomials in 

Algorithm |4.1[) . 


Algorithm 4.1 Lanczos Algorithm for Computing Oj, fij. 

Input: yj, j = l,...,n 

Output: Oj (j = 1,..., n) and jdj (j = 1,..., n - 1), i.e., the nonzero elements of P„ 
Set qrj{x) = 0 and qi{x) = 1. 

for j = 1,..., n do 

1 . aj = {qj,xqj)yg = {qj.xqj)^^/^; 

2 . r={x-aj)qj-Pj-iqj-i; 

3 - Pj = 'J{r,r)y^e = y/iryr^^; 

end for 


Finally, the Chebyshev polynomials of the first kind satisfy the three-term recursion 

Tk+i{x) = 2xTk{x)-Tk-i{x), To = l, T-i = Ti. 

This yields the following second-order finite-difference Cauchy problem for the vector e Tfc(P„)b„: 

— -w --- '" --- = ‘C(Pn)^fc, ^o = b„, ^-1 = ^1 (4.16) 


(^ is defined in (3.13)). The recursion (4.16) is the reduced-order version of the recursion (3.12); in particular, 
the n X n Jacobi matrix P„ is our ROM of the propagator P = cos (tn/A) and b„ is our ROM of the 
source/measurement distribution b. According to (3.9), for k = 0,...,2n- 1, our measurements may be 
written as fk = {{b,Uk)), where Uk satisfies (3.12). Similarly, we define the measurements for our reduced- 
order recursion in (4.16) by 

= b^Tfe(P„)b„ for fc = 0 , ..., 2 n-l. 

Then, according to (4.11), we have = fk for fc = 0,..., 2n - 1, i.e., our reduced-order model matches the 
data exactly. 

We conclude this section with the following lemma, which states that the reduced-order model matrix 
P„ is in fact the projection of P onto the space spanned by the (orthogonalized) snapshots. 


Lemma 4.5. The reduced-order model Jacobi matrix P„, constructed via Algorithm \j.l\ and the vector 
b„ = \/cei are (up to scaling for b„ j the orthogonal projections of P and b, respectively, onto the Krylov 
subspace 

ICn{uo,P) = spanjuo,... ,u„-i} = spanjui,... ,u„}. 


I.e., Pn = V*PV and b„ = -^V*b. 


Proof. The Lanczos algorithm we use to orthogonalize the snapshots, given in Lemma |3.10[ may be written 


as 


C(P)F = FaTn)+Cii?„,ie^, (4.17) 

where V = [i?i(a:),... ,i?„(a;)] (we have transformed the normalized, orthogonalized snapshots dj to spatial 
coordinates x) satisfies V*V = I„xn, ^?n+i is orthogonal to dj for j = 1,..., n, and the Jacobi matrix 


C(T„) = 


bf alj 


'^n-1 

at 


(4.18) 
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Using (3.13), (4.17) may be rewritten as 


PV=VT^+-bl,,K,iei; 


(4.19) 


(4.20) 


T„ is also a Jacobi matrix, since T„ = I„xn + (T^)- From (4.19), we have 

T„ = V*PV, 

i.e., T„ is the projection of P onto /C„(mo,F). Thus our goal is to show T„ = P„. 

The columns of the matrix Z = U^, defined in (4.9), form an orthonormal basis of ICniuo, P) — they span 
K,n{uo,P) since the columns of U span /C„(uo,T’) and $ is nonsingular, and they are mutually orthogonal 


since, by Lemma 4.3 


Z*Z = ^'^U*U^ = ln.n- 


Moreover, from (4.4) and (4.5) we have 


Z*PZ = ^^U*PU^ = = 0. (4.21) 

Now, since the columns of V and Z both form orthonormal bases of the Krylov subspace ICn{uo, P), 
there is an orthogonal matrix e K"’'" such that 

(4.22) 


V=ZQl. 


Then (4.20)-(4.22) imply 

T„ = V*PV = Q„Z*PZQl = Q„©Q^; (4.23) 

because the 9j are distinct (by Lemma 4.1), (4.23) is the unique unitary eigendecomposition of T„. In 
particular, the eigenpairs of T„ are (0j, Q„ej) for j = 1,... ,n. By (4.22) and (4.6)-(4.7), the squared first 
components of the eigenvectors of T„ are 

\ 2 / rp ^ _ ^2 


(efQ„e,) ={e{V*Ze,) = [(Uei)* C/$e,]^ 




Uei I U^Bj 


- I = 


Recalling (4.14)-(4.15), we find that the eigenvalues and squared first components of the normalized eigen¬ 
vectors of the Jacobi matrices T„ and P„ are the same. Therefore, by the uniqueness of the solution to 
the Jacobi inverse eigenvalue problem (see, e.g., the survey article |4] by Boley and Golub and references 
therein), T„ = P„; i.e., P„ = V*PV is the orthogonal projection of P onto JCn{uo,P)- 

Finally, since the columns of V are orthogonal and the first column of U is 6 (see Algorithm 3.1), we 
have, by ( |4.11 )-(4.12), 

V*b = b*bei = cBi = \/cb„. 

□ 


Remark 4.6. The result of Lemma \4.^ suggests the following alternative method for computing the reduced- 
order model P„. Proposition \5 . 1\ implies the matrix V may be constructed via Gram-Schmidt orthogonaliza- 
tion; this results in the factorization U = where 'K. 6 K"**" is an invertible, upper-triangular matrix. The 
matrix 72. may be computed via a Cholesky factorization of the known, symmetric, positive-definite matrix 
U*U because 

u*u = n^v*vn = n'^n. 


Then, by Lemma 


U*PU = n^V*PVn = 72^P„72, 


from which we obtain 

P„ = 72“^([/*PC/)72^\ 

One may also obtain P„ directly from H = (U*U)~^{U*PU) via P„ = 72H72.~^. 


Remark 4.7. We emphasize that the Gram-Schmidt procedure used to orthogonalize the snapshots respects 
causality, since each successive snapshot is orthogonalized only with respect to the previous snapshots. The 
importance of this from a physical perspective cannot be understated, since the time-domain solutions of the 
wave equation are causal — all of the linear algebraic tools we employ must respect this causality. 
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4.3 Galerkin approximation and algorithm to compute 7 ^, 7 ^ 

In the previous section, we computed the entries of the matrix P„, namely aj {j = and jdj (j = 

1,..., n - 1), from the data. Now we want to convert the set of Uj and /3j to 7j and 7^-, since 7^ and 7^ are 
localized averages of the velocity and thus give us direct information about the unknown velocity. Although 
this may be done via the formulas from Lemma 3.10 (after transforming the aj, jdj to a“, 6“ using (4.18)), we 


prefer the algorithm derived here as it gives deeper insight into the relationship between the discrete ROM 
and the continuous problem. In particular, we use renormalized versions of the orthogonalized snapshots 


Uj, Wj as the test and trial functions for a Galerkin method for the system (3.20). The coefficients of the 


Galerkin method satisfy a finite-difference recursion, and the eigenvalue problem for this recursion leads to 
an algorithm that computes 7^ and jj. For the remainder of this section, we assume that eigenvectors of 
symmetric matrices are normalized to have Euclidean norm 1. 

We begin by considering the following Galerkin approximation to Uk and Wk- 


- XI and wj^ = ^ ujj^kljWj for fc = 0,..., 2u - 1. 


(4.24) 


j=i 


j=i 


We define e P2,k,^^2,k, ■ ■ ■, Pn,k,i^n,kY■ Then 


^k 

A") 


= Qrsi^\ 


where Q is defined in (3.25) and T is dehned in (3.28). In combination with (3.23), a calculation shows that 

(4.25) 

where 


si”) E i 


fil,k+l ~ fil,k 

^l,k - ^l,k-l 

f^n,k+l ~ f^n,k 
!^n,k ~ ^n,k-l. 


(4.26) 


Recall that Uk and Wk are the solutions of (3.22). Substituting ul") and wl") into (3.22) and requiring 


the resulting equation to be orthogonal to the columns of QF with respect to the inner product 
gives the Galerkin method 

TQ* (FQFSi”) - S^QFSi”)) = 0. 

Then (3.26) (i.e., Algorithm |3.1[ ), ( |3.27 ), and ( |4.25[ ) imply this is equivalent to 


TQ* + —U2n^ie^n) rsi”) - FQ^srsf si”) = 0. 


Finally, Algorithm 3.1 implies Q*Q = T , so the above equation is equivalent to 

r-i0si”) - a^si”) = 0 for fc = 0 ,..., 2 n - 1. 


(4.27) 


The Galerkin method (4.27) is equivalent to the following finite-difference scheme for the spectral coefficients 

l-^j,ki ^j,k’ 


f-^j,k+l f-^j,k ^j-l,k ^j,k 

^j,k ~ ^j,k-l f^j-,k ~ f^j+l,k 

Pn+l,k = 0, Wo.fc = 0, 

.Mj.o = ^i,o + Wj_i = 0. 


for j = 1 ,..., n, /c = 0 ,..., 2 n - 1 , 


(4.28) 
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The boundary conditions 'pn+i,k = 0 and wo,fc = 0 are enforced to ensure that the recursions in (4.281 are 
equivalent to (4.271 for j = n and j = 1, respectively. The initial conditions pjfl = and + ^j-i = 0 

are the projections of the corresponding initial conditions from (3.20): for i= 1,... ,n we require 

n 

- Uo,%Uij^ = 0 ^ -5ii=0<> fljfi = 


i=i 


and 




— 0 ^j-,0 ^j,-i ~ 0- 


Because spanjuoi ■ • •; Wn-i} = span{?Ii,..., u„}, we have = Uk for fc = 0,..., n- 1; similarly, = Wk for 
A: = 0,..., u - 1. 

We will now derive an algorithm for computing that is based on the eigenproblem for the recursion 

(4.28). First, note (3.9) implies 


7i = {ui,ui)y^= ((uo,uo)) = {(b,b)) = /o = c, 


(4.29) 


where c is defined in (3.5) (and, hence, is known from our measurements). Next, we define = [/ri,fc,.. •, ■ 

We eliminate from the recursion (4.28) to find that Jij^ satisfies the second-order recursion 


^ ^ = M/^fc for fc = 0,...,2n-1, /Xq = ^ei, =/x^. 


(4.30) 


where M e D G, D e diag (yi,...,%), and G e M”’'" is the Jacobi matrix defined by 

[- 71 ^ 71 -^ 

71 ^ -( 71 -^+ 72 -^) ••• 


Ge 


7 n-l 


7 „-i -( 7 ji + 7 „^). 


The boundary conditions that are implicit in the definition of M (which follow from (4.28)) are 

~ n J P'0,k~P-l,k /J oi\ 

Pn+i,k = 0 and -= 0. (4-31) 

70 

Remark 4.8. The recursion ( 4.30[ )-( [4.31[ ) may also be viewed as a c enter ed-difference discretization of 
(3.14) on a staggered grid with Jlj^k = u (x ^), y^ = =4^ /jX ^jgg § A. 6 for more details, in particular 

( A.15| ) ); this matches the optimal grid discretization utilized in |5l equation (2.8)] (with a in that paper 
replaced by 1/v). 

Although M is not symmetric, it is self adjoint and negative definite with respect to the inner product 
(•,•)_, where 

n 

{x,z)~ = X^Bz=Y, XiZi%, X,Z6K”. 


2=1 


In particular, we may symmetrize M as follows: 


MeD^^^MD = d ^^^gd = 


. 1/2 


(4.32) 


We make the change of variables = D /x^ in the recursion (4.30) to find <;k satisfies 


(4.33) 


where b„ is defined in (4.16). We now prove M = ^(Pn), i-e., we prove (4.33) and (4.16) are equivalent. 
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The primary Galerkin approximation from (4.24) may be written 


■An) 

‘■k 


.1/2 


where V = [i^i,..., iJn] = [ui, ..., u„] D is constructed via the Lanczos algorithm in Lemma 3.10 Applying 


the Galerkin method to (3.17) (by inserting = V<;k into (|3.17) and multiplying on the left by V*), we 


find also satisfies the recursion (4.33|) with M = V*^ (P) V = ^ (Pn) by Lemma 4.5 Thus 7^ , 7^ may be 


computed by comparing M and ^ (Pn), the latter of which is known. In particular, recalling (3.13), (4.10), 
and (4.32), we find % = c~^ (from (4.29)), 71 = (1 - ai)7i] ^, 


I 3 = 


4/?-1^-17 _i' 


and 


I 3 = 


(l-a/)7i 


1 

72-1 J 


T-l 


for j = 2 ,...,n. 


We now present an alternative (equivalent) algorithm for computing 7^, 7^. This algorithm is a simplifi¬ 
cation and beautification of the Lanczos algorithm we have not seen in the literature, and we utilize a matrix 
version of it (Algorithm |7.2[ ) for multidimensional problems. In the interest of space, we defer its derivation 
to § A.15 in the appendix. 


Algorithm 4.2 Computation of 7j, 7_,- 
Input: yi, A; = -^{Oi) for Z= l,...,n 
Output: 7 ,-, 7 j, j= l,...,n 

Set hlo = 0 and = n/(L 5 • [j/i, yi, 1/2,2/2, ■ ■ •, 2/n, 2/n] • 

for j = 1,..., n do 

11 

j2 “ 77 T’ 

Ip(k-) EKM2) 

2=1 

2. iJj = 


U II _ ||2 2n ’ 

2=1 

4. = 77 ^ -7jLa;j . 

end for 



5 Inversion algorithm 


Algorithm 3.1 (and, equivalently, the Galerkin scheme from § 4.3) yields the averaging formulas 




/ '^max /— 1 7— 

0 W) ^dx 


and 


V = 7T 


C ^Irnax / \ ^ 

Jo {wj) vdx 


(5.1) 


Lemmas 3.10 and 3.11 imply that the weight functions Uj and Wj (up to normalization factors) can be 
computed via the Lanczos process with the operators C(P) and C(Pc), respectively, and localized initial 
conditions. 

The proposition below states that the orthogonalized snapshots Uj and Wj may be equivalently computed 
via Gram-Schmidt orthogonalization of the snapshots Uk and {7^, respectively. One of the well-known 
interpretations of the Marchenko-Krein-Gel’fand-Levitan (MKGL) method is that it is a probing via Gram- 


Schmidt orthogonalization of the triangular matrix of the snapshots (the matrix U defined in (3.10)) |39) . 
Assuming that uq = b is an approximation of a delta function, due to causality the snapshot matrix U 
will be an approximation to a triangular matrix; after Gram-Schmidt orthogonalization, the orthogonalized 
snapshots Uj and Wj will be localized functions. This is a result of the fact from linear algebra that the 
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Qi?-factorization of a full-rank, upper triangular matrix U has Q = I, where I is the identity matrix (the 
rectangular identity matrix if U is rectangular with more rows than columns). The proof of the proposition 
is given in the appendix. 


Proposition 5.1. Suppose the orthogonalized snapshots Uj and Wj are obtained via Algorithm 
denote the orthogonalized snapshot obtained via the Gram-Schmidt algorithm, i.e., 


3.1 


Letuf 




3-1 


= Wj-i - E <^^3^^ ’ c" E luj.i, -jpj 




-GSII 




» 111/ 


Then uf^ = (c?“) ^Uj, where 


^ 

^ j-i 


1 - 

i=l 

Similarly, let denote the orthogonalized snapshot obtained via the Gram-Schmidt algorithm, so 

1 




J-l / :^GS \ 

= - E where cfj = Uj-i, ..gg,, | ■ 

*=1 ' st/jt 


\w?n~' 


(5.2) 


Then = (dj) ^Wj, where 


= 

- 


Et. 

i=l 


3-1 / 
i=l \ 


(2j - 1) X - E 7* {wj-i,Wi)- Y, % 


k=l 


In addition, in slowness coordinates x, the orthogonalized snapshots Uj and Wj depend weakly on the 
velocity v for small a (assuming r is of the same order as a); moreover, Uj and Wj are asymptotically 
proportional to v and =, respectively. The weak dependence of Uj and Wj on v and the aforementioned 
asymptotic behavior of Uj and Wj can be justified via the Wentzel-Kramers-Brillouin (WKB) limit. 

We next dehne a reference velocity that is useful in our inversion scheme. 

Definition 5.2. Let v^(x) be a (smooth enough) reference velocity with u°(0) = u(0). Then the reference 
slowness (traveltime) coordinate transformation is defined by 




(X)E f 

Jo 


0 v^(x') 


dx'. 


The reference primary and dual orthogonalized snapshots and Wj and reference coefficients ^ and jj are 

withv replaced bylP (including in the definition of A). The reference coefficients 


3.1 


computed via Algorithm 

may be equivalently computed via Algorithm \4-.S\ 

To see why we require z;°(0) = u(0), note that the PDE in (2.3) is equivalent to gxx - = -u(0)^(5(a; + 

Q)6{t)t. We thus take z;°(0) = u(0) to ensure that we use the same forcing term for the true and reference 
velocity systems. 

Because Uj and Wj are localized and asymptotically proportional to v and 1/u, respectively, (5.11 implies 
that gives an estimate of 1 jv near the center of mass of while 7j gives an estimate of v near the center 
of mass of vPj. Although uj and Wj are not known a priori, as discussed above they are weakly dependent 
on the velocity. Thus the center of mass of (respectively, uj|) is well approximated by the center of mass 
of (uj)^ (respectively, (w^Y). 

Our inversion algorithm proceeds in two steps. First, we approximate the centers of mass of the squared 
orthogonalized snapshots, for j = 1,..., n, by 
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where = x^(xmax)- Next, we approximate the velocity at the preimage of the primary and dual grid 


points in (5.31 by 


and v(x ^(^)) = v(^)<^v°(^)^. 


7i 


71 


(5.4) 


^ and 7 ° correspond to dual and primary steps, respectively, of optimal grids 0/. That is, formulas (|5.3[) 


Remark 5.3. Formulas (5.3) and (5.4) will be simplified for = 1, in which case 2r = a;^. In this case 


and (5.4) are similar to the formulas for optimal grid inversion E’ except in the latter case and Xj are 


defined as cmd ’Eilili) respectively, for j = When ajr is close to ^2/4, these definitions 

can be guite close, but generally they may differ significantly, in which case (5.3) and (|5.4[ ) will give more 
accurate results than the conventional optimal grid approach. One can conjecture that (|5.3[) and (5.4) give a 


second-order approximation of smooth v with respect to the width ofuj and Wj, which can be measured asf^ 


and respectively. Generally, formulas (5.3) and (5.4) can be extended to “conventional” optimal grids. 


in which case we can also conjecture that they would produce nodal values very close to those of conventional 
optimal grids m- 

Finally, we may approximately invert the traveltime coordinate transformation to convert the traveltime 
grid nodes ^ an d 2^ to physical coordinates. In particular, since the traveltime coordinate transformation 
is given by (2.2), the inverse traveltime coordinate transformation is 


(cc) = f V (x') dx'. 

Jo 


(5.5) 


Since we only know v at the traveltime grid nodes ^ and we approximate the above integral via a 
right-endpoint Riemann sum. We obtain the following formulas for the approximate physical grid nodes, 
where we take = 0: 


$? = E (^ - 5?-i) - + E (-? -- (2?) - 5^" (^) 

2=1 


2=1 

j 


for j = l,...,n. 


(5.6) 


= E (^ - 5?-i) - + E (2? -- (2?) « 2" (2?) 

-. 2=1 2=1 

Our inversion algorithm is summarized in Algorithm |5.1[ 


Algorithm 5.1 ID Inversion Algorithm 


Input: measured data /^(fc = 0,..., 2n - 1), reference velocity 
Output: approximations of v(x~^ (^)) v{x~^ (2j)) 

1. Compute the grid nodes ^ and 2^ for j = 1,..., n. 


ttO 


a. Compute the reference primary and dual snapshots by solving (3.16) with v replaced by v 
(including in the traveltime coordinate transformation) using finite differences, for example. 

b. Orthogonalize the reference snapshots via Algorithm 4.2 to obtain and 7° for j = 

l,...,n. 

c. Compute the traveltime grid nodes and 2*( from (5.3) using the trapezoidal rule, for example. 


2. Compute c= fo and 9j, yj (j = l,...,n) using (4.8) and (4.4)-(4.7). 


3. Compute 7^, {j = I,... ,n) via Algorithm 4.2 


4. Compute the approximation of the velocity on the traveltime grid, i.e., v (xj) and v (xj), from (5.4). 

5. Approximately convert the traveltime grid nodes 2^ and 2^ to physical grid nodes Xj and Xj using 

(1^. 

6. Combine the results from steps 4 and 5 to obtain the estimate of the velocity at the (approximate) 
physical grid nodes, namely v (xj) « 2'(x®) and v (xj) « v(jxj'). 
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6 Numerical experiments 


We now present some numerical results to illustrate the main ideas of the paper. In all of our simulations, we 
used a uniform reference velocity given by v^(x) = u(0). A comparison of the performance of 2D reverse time 
migration (RTM) and a 2D backprojection method closely related to the method described in this paper 
may be found in m- 

In Figurej^a), we plot the snapshot matrix U defined in (3.10). In Figure[^b), we plot the orthogonalized 


snapshots Uj constructed using Algorithm 3.1 note the localization of the orthogonalized snapshots. In 


Figures j^a) and (b), we have scaled the snapshots so that || Uj || = || uj || = 1. The velocity we used in the 

simulation is represented by the solid, black line in Figure j^c). We mapped the grid points and to 
the spatial grid by approximately inverting the map x{x) via (5.6). The approximations to v (x~^ (^j)) 

V (x~^ (^)) 3-^® represented by blue circles and green squares, respectively. We chose a = 0.01 and r = 2.5cr 
for these simulations. At this point, we do not have a rigorous method for optimally choosing r; as mentioned 
above, we conjecture that we should choose r to be consistent with the Nyquist-Shannon sampling limit of 
gj so T ~ a. Below we will see that even certain choices of r -- cr lead to good reconstructions while other 
choices of r -- cr can lead to very poor reconstructions. As a measure of the stability of our algorithm, we 
computed the condition number of the matrix [/*[/ (see ( 3.10| ) and ( |4.4[ )). For the above parameters, we 
have cond(C/*C/) « 61.76. 

If r is too large, the inversion procedure produces poor results. Figures[^d), (e), and (f) are the analogues 
of Figures [^c), (b), and (a), respectively, in the case where r = 3.5a. The orthogonalized snapshots in 
Figure [^f) (r = 3.5cr) are not as localized as those in Figure [^b) (t = 2.5a); the quality of the inversion 
suffers as well. However, the algorithm is stable in the sense that cond(t7*C/) 13.13. 

Finally, we ran a simulation with r = 0.5cr. In this case the algorithm runs into stability issues, a problem 
heralded by the fact that cond([/*?7) rs 1.55 x 10®. 

These numerical experiments suggest that an appropriate value of r may be chosen by first selecting a 
relatively large value of r - cr and decreasing it until cond(17*[/) becomes too large. 

These results can be understood from a physical point of view. If t is too large, the wave travels too 
far between consecutive measurements, so the corresponding snapshots have disjoint supports. Since our 
method obtains the image from the projection of the propagator onto the subspace of the snapshots, if there 
are regions of the domain not covered by the supports of the snapshots there is no way for us to reconstruct 
the velocity in those regions. If r is too small, the snapshots overlap too much and become almost linearly 
dependent, which leads to a large condition number for the Gram matrix U*U. 

In Figure we plot the primary snapshots, orthogonalized primary snapshots, and inversion results for 
two additional velocity models. The first velocity model is illustrated in by the solid, black line in Figurej^c). 
We chose t = cr for this simulation. The orthogonalized snapshots in Figure |^b) are quite localized. In this 
case, cond([/*t/) ^ 4.11 x 10^. 

The second velocity model, illustrated in Figure consists of two smooth inclusions and a discontinuous 
inclusion. We chose r = 1.5 (t, which gives cond{U*U) « 28.10. 

Finally, we justify our use of the centers of mass of the reference squared orthogonalized snapshots for 


the grid points in (5.3) instead of the centers of mass of the squared orthogonalized snapshots for the true 
medium (which are unknown in practice). In Figure]^ the blue squares represent the true centers of mass 
of the primary squared orthogonalized snapshots, i.e., the height of the _)*** blue square is 




[u,{x)f 


i;(x) / Jo 


J r 21max r, 1 

-r^dx. 

0 v{x)^ 


( 6 . 1 ) 


The green circles represent the centers of mass of the primary squared orthogonalized snapshots for the 
(uniform) reference medium, i.e., the height of the j*** green circle is 




\xy 


dx. 


( 6 . 2 ) 


In practice, the map x ^ cannot be computed exactly since v is not known a priori. The red asterisks in Fig¬ 
ure [^represent the centers of mass of the reference squared orthogonalized snapshots that are approximately 
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Figure 1: In this figure, we show that the choice of r can have a large influence on the localization properties 
of the orthogonalized snapshots and the quality of the inversion, (a) The primary snapshots Uk for the 


3.1 


velocity model Illustrated in (c); (b) the orthogonalized primary snapshots Uj generated by Algorithm 
(converted to the spatial coordinate x); (c) the true velocity model (solid, black line) and inversion results for 
T = 2.5cr — the blue circles are approximately located at x~^ (^) the green squares are approximately 
located at x~^ (^)- (d) The true velocity model and inversion results when r = 3.5 ct; (e) the primary 

snapshots for the velocity model in (d); (f) the orthogonalized primary snapshots for the velocity model in 
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Snapshots (r = a) 



(a) 


Orthogonalized Snapshots (t = a) 



(b) 


True/Imaged Velocity [t = a) 



True/Imaged Velocity (r = l.Scr) 



(d) 


Snapshots (t = 1.5(t) Orthogonalized Snapshots (t = l.Scr) 




Figure 2: (a) The primary snapshots for the velocity model in (c); (b) the orthogonalized primary snapshots 
for the velocity model in (c); (c) the velocity model is drawn as a solid, black line, while the inversion 
results for T = cr are represented by the blue circles (^)) ^-^id green squares (a; ^ (5^))- (d) Another 
velocity model and inversion results; (e) the primary snapshots for the velocity model in (d); (f) the primary 
orthogonalized snapshots for the velocity model in (d). 
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converted to true coordinates using our imaged velocity from ( |5.4[ ) and a Riemann sum approximation of 

these are the grid points used in the inversion scheme 


the integral in (5.51, namely the formulas from ( |5.6[ ) 

(and are those shown in Figures [^c) and (d) and Figures [^c) and (d)). In particular, Figure a) corre¬ 
sponds to the velocity model in Figure [^c), Figure |^b) corresponds to the velocity model in Figure [^d), 
Figure [^c) corresponds to the velocity model in Figure [^c), and Figure |^d) corresponds to the velocity 
model in Figure [^d). We note that the centers of mass agree quite well (to within a few percent or less) if 
r is chosen appropriately (as in Figuresa), (b), and (d)), while they differ significantly (around 28%) if r 
is chosen poorly (Figure |^b)). There are even certain choices of r for the velocity model in Figure [^b) for 
which the grid points are not monotonically increasing — in particular, the orthogonal snapshots have large 
values far away from the peak centered near the “optimal” grid point, which leads to a poor approximation 
of the true center of mass. 


7 Extension to two dimensions 

In this section, we extend our results to two dimensions. Because the majority of the results from the one¬ 
dimensional case carry over without signihcant modifications, we will keep our discussion relatively brief. 


7.1 Multi-input/multi-output formulation 


We begin by dehning the region ft = [OjXmax] x [-J/max,2/max], where we typically take i/max = We place 
m sources at the points (0,?/*) for f = 1,..., to, which leads us to consider the following Cauchy problem on 
n X [0, oo[: 

Au'' + ult = 0, u\t=o = q{A)S(x + 0)S{y-y'-),ul\t=o = 0, (7.1) 


where we take q is as in (2.81, and 



(7.2) 


together with the boundary conditions 


u 


2/—iymax 


= 0 , 


Uxlx=0 


=0 — 0, U \x=x 


= 0 . 


We assume ?;(0, y) = ?;(0,0) for y 6 [-i/max, 2/max]- 

For simplicity, we place the receivers at the same locations as the sources. Then, for /c = 0,..., 2n - 1, 
we organize our measurements in a matrix = u®(0,2/%A:t). This is the square multi- 

input/multi-output (square MIMO) problem in control theory terminology. 


7.2 MIMO reduced-order model in block form 

For * = 1,..., TO, let M* be the solution to the following Cauchy problem on fl x [0, oo[: 

Au^ + ult = 0, u^\t=o = b\ ul\t=o = 0, (7.3) 

where 

b\x,y) = v{0,0)q{Ay/^d{x + 0)d{y - y^). (7.4) 

For fc = 0,..., 2n - 1, we define the snapshots 

Uk^[ul...,uT] = n{P)B, (7.5) 

where P = cos (r\/(4) is the propagation operator, B = [6^,..., 6™], and 

ul = u\x, y, kr) = Tk{P)b\ 


Then the measurement matrix 


Ffc = U*Uk = B*mP)B, 


( 7 . 6 ) 
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Index 

(a) 


Primary Center of Mass 



10 20 

Index 

(b) 



Figure 3: In this figure, we plot the centers of mass and approximate centers of mass for (a) the velocity 
model from Figure [^c); (b) the velocity model from Figure [^d) — the disagreement between the various 
centers of mass in this figure arises because the orthogonalized snapshots are not well localized (because we 
chose T to be too large — see Figure [^d)-(f)); (c) the velocity model from Figure [^c); (d) the velocity 
model from Figure [^d). 
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{x,y) 


where * is defined as before (see ( 3.11[ )) with the inner product 

{{u, w)) = u{x, y)w{x, y) 

The measurement matrix can also be represented by 

Ffc = Tk{y)r]Q{p) dp, 


dxdy. 


(7.7) 


where T 7 q is an m x m matrix measure. In particular, (p) = ff{Q,y^,p) where Tf{x,y,p') is defined as in 
(3.3) with p replaced by 


p\x,y,\) = 


(7.8) 


Z=1 


(A/, 2 /) are eigenpairs of A with {{zi^Zj} = Sij. Next we construct a generalized Gaussian quadrature such 
that 

Tk{p)Voiy)dp=Z^jTk{&,)Yj = Fk forfc = 0,...,2n-l, 

1 1=1 


(7.9) 


mxm 
mxm 


i.e., for I = 1,... ,TO, '^ij € K"* is the column of the matrix Yj, and 


where ,...,e 

&j — diag (^ij j - ■ ■: dyyij ) 6 

We define the snapshot matrix U = [Ui,... ,Un]- Then matrix versions of Lemmas 4.2-4.4 hold. In 
particular, if 


11={U*U)-\U*PU)€ 


^mnxmn 


and El= 


mxm 

^mxm 


then 


Fk = F^{U*U)Tk{Ii)F,. 
Additionally, H has the eigendecomposition 

Fl=^®^^U*U, 


(7.10) 


(7.11) 


where 0, $ e such that = Imnxmn- We emphasize H is known by the matrix version 

of Lemma 4.4 (with fk replaced by in the statement and proof of the lemma). Substituting (7.11) into 
(|7.10|) gives 

Fk = X^Tk{&)x for fc = 0,...,2n-l, where x = 


Comparing this with (|7.9|) gives 


diag Qj = © and x = 


Y 


Y 


■Tn 


(7.12) 


(7.13) 


We may also construct a symmetric, positive-semidefinite matrix C e K™xm ^ symmetric, block- 
tridiagonal matrix 


P. = 


OLl (3^ 

fdi 0L2 


(^n-1 


€-1 


= p: 6] 


(7.14) 


with ctj = aj € and /3 ■ 6 such that 


n 

Ffc = ^ Y^Tk{@,)Yj = Ci/2E^Tfe(P„)EiCi/2 for fc = 0,..., 2n - 1. 


(7.15) 
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Taking k = 0 in (7.151 gives 


n 

C=^Y,Yj = Fo=/ Voik^)dp. 

7 = 1 


(7.16) 


From (7.16), we immediately see that C is symmetric and positive-semidefinite — C will be positive-definite 
if and only if the matrix [Yi,..., Yji] = [’d'n,..., ^mn] s has rank equal to m. 

Analogously to the ID case, the matrix P„ has the eigendecomposition 


P„X = X©, X=[Xi,...,X„]e] 


omnxmn 


X/X,=5,,I 


7_7-*-mxm 


(7.17) 


(the matrices X^- e . 


are “block eigenvectors” of P„ corresponding to the “block eigenvalues” &j). 


We compare (7.15) with (7.12)-(7.13) to find E( Xj = C j. Because ( |7.17| ) is equi valen t to 0X^ = 


X“* P„, the matrix P„ may be constructed using the block-Lanczos algorithm in Algorithm 7.1 (the mn x m 


Lanczos “vectors” in this algorithm are the “columns” of the matrix X , i.e., X = [Q^,..., Q„]) — see, 
e.g., the book by Parlett 00]. 


Algorithm 7.1 Block Lanczos Algorithm for Computing OLj, j3j. 


Input: C, 0^-, Yj for j = 1,..., n 
Output: Oij (j = 1,..., n) and (3j (j = 1, 
Set Qq = Hmnxm and 


. ,n - 1), i.e., the nonzero elements of P„ 


Qi 




for j = 1,..., n do 

1. aj = Qj0Q^; 

2. Rj = 0Qj - Qj 

3. (3^ = 

4. Qj+i = ■ 

end for 


7.3 Continuum interpretation in two dimensions 

We now derive an inversion algorithm analogous to the algorithm we constructed in § [^ The key ingredients 
are the matrix extensions of 7 j and 7 ^; in particular, we now consider m x m symmetric, positive-definite 
matrices Fj, F^- for j = These matrices may be computed via Algorithm 7.2 [HJIID], which is 


a matrix version of Algorithm |4.2| We conjecture that full rank of the Gram matrix U*U is a sufficient 
condition for Algorithm |7.2| to succeed. 
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Algorithm 7.2 Computation of Fj, Tj 
Input: C, Y,, A; = -^(0,) for /= 

Output: Fj, Fj, j = 1, ... ,n 

Set uJo = 02 mn.m, Ml = TOS • [Yi, Yi, Y2, Y2, . . . , Y„, Y„]^ 6 
L = diag (A^^^ -AJ^^ A2 ^^..., A^/^ -A]''^) e M2mnx2mn^ 

for j = 1 ,..., n do 

1. Fj = (pj/Xj) ; 

3. Fj = (wj u)j) ; 

4. -LdJjFj. 
end for 


Remark 7.1. In what follows, we illustrate one way in which the inversion algorithm from may be 
extended to 2D. In particular, we avoid technical details and focus on providing an heuristic justification of 
our algorithm. We have recently developed a more rigorous 2D inversion algorithm that relies on many 
of the ideas discussed in the present paper. 

In 2D, an invertible coordinate transformation to slowness (also known as ray or traveltime) coordinates 
analogous to (2.2) may not exist for most relevant cases due to the formation of caustics [42]. If the medium 


under consideration is “approximately layered” in the vertical direction, however, it is plausible that an 
invertible transformation to ray coordinates exists. 

Henceforth we assume that rays perpendicular to the line a: = 0 do not intersect. This ensures that the 
ray coordinate transformation {x, y) (C, v) exists and is invertible. Here C represents the traveltime along 
a ray and v is orthogonal to we also assume the line cc = 0 is mapped to the line C = 0. Thus the curves 
V = const, represent the rays orthogonal to the line x = 0. We define vI{C„u) = u‘{x{f,v),y{C,,v)) (and 
similarly for other functions of x and y). Then (7.3) transforms to 


Avf + u!^^ = 0, u\t=o = b, ut\t=o = 0, 


(7.18) 


where A is the operator A represented in ray coordinates 
medium, we approximate A along rays by 


In particular, in an “approximately layered” 


~ „ 5 (ldu\ 


(7.19) 


thus our problem essentially reduces to a ID problem (in a layered medium, we have v = y and C, = 
Jq llv(x') dx', as in ID). 

As in the ID setting, we consider the first n primary and dual snapshots, namely Uk and Wk, respectively 
{k = 0, ...,n- 1). The orthogonalized snapshots Uj and Wj (computed via an algorithm analogous to 


Algorithm 3.1) will again be localized in some sense. Moreover, we have Tj = {UjUj)~^ and Fj = {WjWj)~^ 
(where * is defined as in § 4.1 with respect to an appropriate inner product), so Fj and Tj are symmetric, 
positive-semidefinite matrices. The matrices F^-, Fj may be loosely interpreted to contain information about 
the local wave speed as follows. 

As in da HD], we use the diagonals of F^ and Fj, denoted by 7 ^ 6 K™ and 7 ^ 6 K™, respectively, as the 
analogues of 7 ^ and 7 ^ from ( |5.4| ). The reasoning behind our use of the diagonals is twofold. First, the set of 
data matrices Fk (fc = 0,..., 2n - 1) is effectively three-dimensional, as is the set of Fj, Fj. Our problem is 
overdetermined because we are trying to recover an approximation of u on a two-dimensional grid. Although 
we use the full data to compute Fj and F, we reduce the dimensionality by using only the diagonals jj and 

Aj dOj- Second, recall F^-^ = (UjUj). Then efjj = TfT = U'^j- Since the approximate operator A in 


are related to localized averages of =. 


( |7.19| ) is of the same form as in the ID case (see (3.14)), it seems reasonable to assume the quantities e“ 7 ^ 
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Our algorithm thus proceeds as follows. We consider a background velocity with v^(0,y) = v(0,0) 
for y € [-j/max)2/max]- Wc choosc the velocity to be simple enough so the ray coordinate transformation 
is well-posed; for example, we took = u(0,0) in our numerical experiments. For a constant background 
velocity, ray coordinates are particularly simple — in fact, = :^x and v = y. 

The grid points at which we approximate the true velocity are v^) and (^,'V^), where, for a constant 
background velocity is computed as in the ID case and ly^ = y^. The dual grid points are 

defined in a similar manner. The velocity is approximated at the grid points (in ray coordinates) by 




T~0 

e, 7., 




1 1 '' I / T--^ 

e/ 7 ■ 


and 


7? 
I ' .1 


(7.20) 


We may approximately convert the grid points and to spatial coordinates (x°, 2 /°) and 

(■J^,^), respectively, by inverting the coordinate transform using our imaged velocity (much like in the ID 
case). 

In Figure we plot the results of two numerical experiments using our 2D inversion method. In both 
cases, we used a constant background velocity. Figure |^a) is the image of a block — the true velocity 
corresponding to Figure]^ a) is plotted in Figure |^b); Figure [4|(c) is the image of a dipping interface — the 
true velocity corresponding to Figure |^c) is plotted in Figurej^d). In Figures |^a) and (c), the horizontal 
axis is in slowness coordinates, while in Figures Qb) and (d) the horizontal axis is in physical coordinates. 
They show qualitatively correct inversion results, even though our assumption on nonintersecting rays fails 
for the block model. 

The above imaging procedure was further improved of by some of the authors with preliminary results 
(including imaging of a 2D Marmousi cross-section) reported in |37) . 


8 Conclusion 

We developed a model reduction framework for the solution of inverse hyperbolic problems. This is a brief 
summary of our approach. 

• We start with a one-dimensional problem and single-input/single output (SISO) time-domain boundary 
measurement s. 

— We sample the data on a given temporal interval consistent with the Nyquist-Shannon theorem 
and construct the ROM interpolating the data at the sampling points. The ROM is obtained via 
the Chebyshev moment problem, which can be equivalently represented via Galerkin projection 
on the subspace of the wavefield snapshots, i.e., a Krylov subspace of the propagation operator. 

— Using the Lanczos algorithm, we transform the projected system to a sparse form that mimics a 
finite-difference discretization of the underlying wave problem. This transformation is equivalent 
to Gram-Schmidt orthogonalization, and yields a localized orthogonal basis on the snapshot 
subspace. 

— We estimate the unknown PDF coefficient via coefficients of the sparse system. The coefficients of 
this sparse system are weighted averages of the true, unknown velocity, where the weight functions 
are localized (in particular, they are the squared orthogonalized snapshots). 

— Numerical experiments show quantitatively good images of layered media, though the image 
quality depends on the consistency between the time-sampling and the pulse spectral content. 

• We outline a generalization to the multidimensional setting (on a 2D example) with square multi¬ 
input/multi-output (MIMO) boundary data. 

— We construct the MIMO ROM data via the block-counterpart of the SISO algorithm. 

— The continuum interpretation of the MIMO ROM is done via geometrical optics. 

— Two-dimensional numerical experiments show that the imaging algorithm gives qualitatively cor¬ 
rect results even when the geometric optics assumption does not hold. 
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Figure 4: In this figure, we plot the results of two numerical experiments using our 2D inversion method 
sketched in this section. The horizontal axis is in slowness coordinates for figures (a) and (c) while it is in true 
(physical) coordinates for figures (b) and (d). We used a constant background velocity in both experiments: 
(a) image of a block inclusion; (b) true velocity corresponding to (a); (c) image of a dipping interface; (d) 
true velocity corresponding to (d). 


The key of the efficiency of the proposed approach is the weak dependence of the orthogonalized snapshots 
on the media, which allows us to use a single background Krylov basis for accurate Galerkin projection. At 
the moment we only have experimental verification of that phenomenon, and can conjecture a result similar 
to the asymptotic independence of the optimal grids on variable coefficients [S]. We believe that such a 
basis can also be found for interpolatory model reduction in the frequency domain (via a rational Krylov 
subspace), and investigation in this direction is under way. 

Another advantage of our proposed algorithm over traditional FWI is that modeling errors are not an 
issue; because we use a homogeneous background wavespeed, the background solution can be obtained 
analytically. We should mention, however, that our algorithm does not perform well if a nonconstant 
background wavespeed that is not very close to the true wavespeed is used. Although this limits the ability 
of our algorithm to be used iteratively, we believe our algorithm performs very well either as a direct 
algorithm or as a nonlinear preconditioner for generating an initial model for FWI [lOj . Additionally, in § |^ 
we discussed the stability of the ID algorithm in the context of the choice of the sampling stepsize t, which 
essentially plays the role of a regularization parameter. 

We must admit that the generalization to multidimensional problems is still in its initial stage. The 
square MIMO formulation is overdetermined; this gives rise to a multitude of different imaging formulas, 
even though the equivalent state-variable ROM representation is unique up to a change of basis. One such 
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formula, outlined in [37] (still based on the MIMO ROM construction presented in this paper), apparently 
has sharper resolution than the algorithm of § 7.3 We also discovered some stability issues in the 2D case 


— these will be addressed in a forthcoming work. 

Moreover, the collocated square MIMO formulation considered in this work may not be suitable for some 
practically important measurement systems in seismic exploration and other remote sensing applications. To 
circumvent this deficiency, we are looking at the extension of our approach to non-collocated source-receiver 
arrays with a different number of sources and receivers, which leads to rectangular MIMO formulations within 
the Galerkin-Petrov projection framework. Another possible extension is a back-scattering formulation used 
for radar imaging, corresponding to one or a few diagonals of the square MIMO matrix data set. 


Acknowledgments 


The authors wish to thank Olga Podgornova and Fadil Santosa for helpful discussions related to the topics 
presented in this paper. 

A Proofs 

In this appendix, we present some calculations and proofs we omitted in the body of the paper. 


A.l Derivation of (3.2) 


We begin by recalling (2.12): 


oo 

cos(kTs)p(x, s^)sg(s^) ds. 

0 

We make the change of variables y = ts in ( jA.lj ) to obtain 

2 r °° 

u{x,kT) = — cos{ky)p{x,{ylTf)yq{{ylTf-)dy. 
Henceforth we will take the principal branch of arccos, namely arccos : [-1,1] i->- [0, tt]. 


(A.l) 


(A.2) 


Next, we break the integral in (A.2) into infinitely many segments so we can apply an invertible change 


of coordinates of the form p = cosy to each segment; in particular, we have 

2 °° r-(2j+l)-K 

u(x, kT) = — Y, / cos{ky)p [x, {ylTY)yq{{ylTY) dy 


2 

+ — 


Y [ cos{ky)p{x,{yjTf)yq{{yjTf) dy. (A.3) 

j=l 


We now make the following changes of variables in the first and second integrals in (A.3), respectively: 
p = cos{y), 2 /= arccos(^) + 2j7r, dy =- 

V = cos{y), y = - arccos(i^) 2j7r, dy = 



(A.4a) 

■d.i2. 

\/l- 12 '^ 

(A.4b) 
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Using (A.4a) and (A.4b) in the first and second integrals in (A.3), respectively, we obtain 

'arccos(/r) + 2j7r' 


3 = 0 ' 

(arccos(^) + 2j7r)5'| 


2 I 

u{x,kT) = — 'y' / cos(fc(arccos(/i) + 2j7r))p| X, 

_||arccos(^) + 2j7rj^j ^ 


1 ) 


1 




dp 




IL cos(fc(- arccos(z^) + 2j7r) )p I x, 


i=i- 

• (- arccos(i/) + 2jTT)q\ 


- arccos(i/) + 2j7r 


- arccos(i/) + 2j7r 




: dv. 


We then use the 27r-periodicity of cosine, transform j -> -j in the second sum, and use the definition of the 
Chebyshev polynomials of the first kind to find 


2 r ^ / 

2{x,kT) = — Y, / Tk{p)p\x, 

T 3=0-^-^ \ 

■ (arccos(^) + 2j'K)q 

E f Tk{p)plx 

j = -oo -'1 \ 

• (arccos(^) + 2j7r)g 
Tk(p)r](x,p)dp. 


arccos(^) + 2j 


(( 


,).j 


arccos(^) + 2j7r 




dp 


2 

72 


arccos(/i) + 2jTr 


)■) 


(( 


arccos(^) + 2j7r 




: dp 


4.1 


A.2 Support of c ^riQ(fi)dfi for constant velocity 

For s 6 [-1,1], let us define N{s) = c~^r]o{fJ‘) dp. Then the hypothesis of Lemma 

least n points of increase for s e [-1,1] (the set of all points of increase for N is also 
or spectrum of the measure c~^po{p) dp — see, e.g.. Chapter 1 of [13]). Here, we show that if the wavespeed 
is constant, then N has exactly n points of increase in [-1,1]; we provide a qualitative explanation for the 
nonconstant wavespeed case at the end of the section. 


holds if N has at 


cnown as the support 


Recall from (3.3) and (2.6) that 


vo{p) = v{o,p)= E 


3 =- 


1=1 


(arccos(/r) + 27rj)^ 


A; 


zijO) 

u(0)2 


zi{x), 


th 


where {Xi,zi{x)) is the I eigenpair of A and 

(arccos(/r) + 2j7r)^ \ arccos(/r) + 2j7r 1 


rj{p) = 2sgn(j> 






p^ 


Note that if q'ls given by (2.9), then Xj > 0 for /r e] - 1,1[. 

For simplicity, we take the wavespeed v = 1 and Xmax = 1- Recall that we take 2n measurements on 
the time interval [0, T] at the discrete times kr for fc = 0,..., 2n - 1, where (2n - l)r = T. For the sake of 
illustration, let us take t = 1/n for some n; then T = 2 - - and the timestep t is determined by the number 
of snapshots we wish to take. 

When V = 1, the eigenfunctions of A satisfy 


^Pxx — XlZlj 


A.x\x=0 — 0; ■2^z|a;=l — 0- 
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Then the eigenvalues and (orthonormal) eigenfunctions are 

t2 


A/ = 


(21-1)7 


and zi{x) = \/2 cos , 1 = 1 , 2 ,.... 


(A.5) 


Reversing the arguments from § A.l using the fact that r = 1/n, and using the expressions in (A.51 for the 
eigenfunctions of A we find 


N{s) = 4c 


tj: 

°° ri^Tzi- 

I / 

,tl ->(27-1) 


(2j' + l)7rn 
j_Q w/(arccos(s) + 27rj)n 
{27zj- arccos (s)) n 
-l)7zn 


q{r'^)r^S{r^ - Xi)dr 


1=1 


q(r^)rJ^S(r'^ - A;) dr 


1=1 


(A.6) 


For a fixed value of s e [-1,1], only certain values of Xi will contribute to the above integrals. In Figure 
we illustrate the first couple of integration intervals for the first (second) integral in (A.6) in red (blue) for a 
given value of s. The square roots of the eigenvalues from ( A.5[ ) are marked with crosses — here n = 6. As 
s ->• 1“, the intervals will increase in width until the positive half of the real line is covered. 

n arccos(s) 2n7r — n arccos(s) 2mT n arccos(s) Ann — n arccos(s) 

| x X X X X x | x X X X X x | x X X X X x | x X X X X x | x X X 


nn 


2nn 


3nn 


Ann 


Figure 5: In this figure, we sketch the integration intervals for the integrals in (A.6) for a fixed value of 


s € [-1,1]. The square roots of the eigenvalues of A, namely \/Xi, are marked with crosses. The red intervals 
in the figure correspond to the first two integration intervals for the first integral (i.e., for j = 0, 1) while the 
blue intervals correspond to the first two integration intervals for the second integral (i.e., for j = 1, 2). We 
took n = 6 for this figure. 


From (A.5), we see that \/Xq €]0,n7r[ for Z = 1,...,n. As s increases from -1 to 1 more and more 


eigenvalues will be caught in the integration intervals for the integrals in (A.6) and contribute to N{s). 
Since the eigenvalues of A are uniformly distributed on the positive real line when v is constant, each 
integration interval will contain the same number of eigenvalues as every other integration interval for every 
value of s, and each interval will contain at most n eigenvalues. 

For 7 = 1,..., n, let Si 6 [-1,1] be defined by 


n arccos 


(sj) - -^A 


n-(i-l) ^ Si = cos 


\/An- 


(*-l) 


Using this and (A.6) we obtain 


N{s) = Y,OiiH{s- Si), 

i=l 

where H is the Heaviside step function and 


(A.7) 


7=0 


A„_(j_i) + 27rnj) j + 27rnj) 


- Y. d ^(\/A„+j + 2nnj^ j (\/A„+i + 2nnj^ . 


(For our choice of q in (2.9), both of the above series converge.) The upshot of (A.7) is that the n points of 


increase of N{.s) are given by Si. 

Finally, if the wavespeed v is not constant, then the eigenvalues of A will typically not be uniformly 
distributed along the positive real line as in they are Figure]^ (when v is constant). In contrast, the most 
probable scenario is that the spectrum of A is irregularly distributed along the positive real line in which 


case the summation in (A.6) gives an infinite number of points of increase for N{s). 
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A.3 Proof of Lemma 13.21 


Because the slowness coordinate transformation is given by (2.2), the chain rule implies 

d^u 1 d / 1 du\ 


du du dx 1 du ^ 

dx dx dx V dx 


dx'^ V dx \v dx / 


Using this and Uttix,t) = utt{x,t) in (3.6) gives 


^d (I du\ ^ ^ 

The boundary conditions follow from the above calculations, the identity af(0) = 0, and the definition 

3^(3^max) ~ ^max- 

The initial condition Ut\t=o = 0 holds for u since we are not making any coordinate transformations in 
time. The derivation of b requires some care. First, we note that ii u, w e T^[0,a;i„ax]! then u(x) = u{x{x)) 
and w{x) = w{x{x)) e L^[0,afmax] and 

{u,w)y^2 = {u,w)y^. (A.8) 

In terms of distributions, for functions h that are (right) continuous at a; = 0, we have 




(A.9) 


In light of dASt and ( |A9t , the transformation of the distribution (5(a: + 0) to slowness coordinates, denoted 
5{x + 0) (since x(0) = 0), should satisfy 


{5{x + Q),h) = 


m 


lA ^2(0) ■ 


We take 5{x + 0) = =^(5(x + 0); then 


{5{x + ^ 


h(0) h(0) 


. (0) ^;2(0) 


because a:(0)_= 0. Thus b = v{0)q{Ay/^d{x + 0) transforms to 6 = q{Ay/^S{x + 0). 

Finally, A is self adjoint and positive definite with respect to {■,-)iiv thanks to (A.8) and the facts that 
A is self adjoint and positive definite with respect to {■,-)ii^ 2 - 

A.4 Proof of Lemma 13.41 


Suppose u and w solve (3.16). We prove that u solves (3.14); the proof that lu solves (3.15) is similar. 

We differentiate the first PDF in (3.16) with respect to t and the second with respect to x and subtract 
the results to find ^ 

~ ■ ■ =w^^- wtx = 0 . 


zUtt 


(A) ^ 

\V lx 


Multiplying both sides of the above identity by v gives Au + Utt = 0, as in (3.14). 

The boundary condition u\x=x^^^ = 0 follows immediately from (3.16); we differentiate the boundary 
condition w\x=o with respect to t and use the second PDF in (3.16) to find 0 = Wt\x=o = {=Ux)\x=o, which 
implies Ux\x=o = 0. We follow a similar procedure for the initial conditions; u\t=o = b is trivial. We differentiate 
the initial condition iu\t=o = 0 with respect to x and use the first PDF in (3.16) to find 0 = Wx\t=o = (4^t) |t= 0 ! 
so Mt|t=0 = 0. 































Inversion via projection-based model reduction 


38 


A.5 Proof of Lemma 13.61 


We have already essentially proved the first part of this lemma (see (3.8) and (3.12)). 


To prove the second part of the lemma, we begin by noting that the solution to (3.15) is 


i{x,t) = sinlt\/c]c 

' ' V ox 


Then Definition 3.5 implies 


Wk = w{x, (k + 0.5)t) = sin + 0.5)r'ycj C 


1 db 
V dx^ 


(A.IO) 


and, in particular. 


Wo = w 


(x,0.5t) = sin^O.Sr'ycj C 


1 db 
V dx 


Thus we need to show (3.18) and (A.IO) are equivalent, i.e. 


Tf) (cos (r) + Ti^_\ 1 cos ( rV Pc ] 




sin(o.5r\/ffcj C 


V ox 


1 db 


= sin((fc + 0.5) 

This means we must prove 

[^r^^^(cosa;) + r^^|(cosa;)j sin(0.5x) = sin((fc + 0.5)a:). 

The well-known identities 

T^^\x) = 2 Y, Tk{x) (j odd) and T^'^\x) = 2 Y Tk{x)-l (j even), (A.12) 

k=l k=0 

k odd k even 


(A.ll) 


together with Tj{cos{x)) = cos{jx), imply (A.ll) is equivalent to 


k k 

2 Y cos{jx) + 2 Y cos{jx) - 1 

j=l j=0 

j odd j even 


sin(0.5a:) = sin((fc + 0.5)a;). 


(A.13) 


We will use induction to prove (A.13) is an identity. The case k = 0 follows immediately. For the induction 


step, suppose (A.13) holds; we will prove it also holds with k replaced by fc + 1. 
We have 


sin((/c + 1.5)a;) = sin((fc + l)a;) cos(0.5a;) + cos((A: + l)a;) sin(0.5a;) 

= 0.5 [sin((fc + 1.5)a;) + sin((fc + 0.5)a:)] + cos((fc + l)x) sin(0.5a:); 

solving the above equation for sin((fc + 1.5)a;) yields 

sin((A: + 1.5)a:) = sin((/c + 0.5)a;) + 2cos((fc + l)x) sin(0.5a:). 

This and the induction hypothesis ( A.13[) imply 


sin((fc + 1.5)a:) = 


fc+i 


fc+i 


2 Y cos{jx) + 2 ^ cos(ja;) - 1 
j=o 

j even 


1=1 
j odd 


sin(0.5a;). 


as required. 
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■ ■ ■ ('2') 

Finally, the recursion (3.19) follows from (3.18) because the Chebyshev polynomials satisfy (x) = 

( 2 ') ( 2 ") 

2xT^ ^{x) - T^_[{x) and (where all Chebyshev polynomials are evaluated at Pc) 


_ _ _ /^(2) , Tn(2) ^ Q [rpi^) rp{2) 1 rp{2) rp(2) ^ ~ 

Wk+l-2Wk-^Wk-l {-^k+l'^^k ^[^k ^fc-1 ''fc-2/^0 


[ 7 ^( 2 ) _ 7 - 1 ' 
M/c+1 ^k 


( 2 ) _ rp(2) 

k-1 ' ^k-2 


T}:\ + zco 


[2PcT, 


( 2 ) _ rp{2) _ rTi{2) _ rp{2) _ rp{2) 

k k-1 ^k k-1 ^k 


2PcTtl\] 


Wo 


= i{Pc)wk- 

The initial condition {cq + tc-i = 0 can be derived from ( 3.18[ ): 

wo + 51-1 = {Pc) + (Pc)] wo + (Pc) + (Pc)] wo = 0 

because = 1, = 0, and = -1. 

A.6 Proof of Lemma 13.71 


In order to avoid getting too involved in technical details, we present a proof of Lemma 3.7 in a discrete 
setting. In particular, we discretize the differential operators involved in the proof using finite differences. 
This allows us to circumvent the technicalities involved in specifying the domains of the differential operators 
in question, although, as we will see, the discrete operators still retain information about these domains. 
Moreover, this proof highlights many of the details of numerical simulations. 

We discretize on a staggered grid, illustrated in Figure The m + 1 “primary” nodes are 

indicated by the symbol o and the to + 1 “dual” nodes {x^}JIo are indicated by the symbol x. We take 
TO » 1 to ensure that the continuous operators are well approximated by the discrete operators. In practice, 
we use a uniform grid with hj = h for j = 1,... ,to, hi = /i/2, and hj = h for j = 2,... ,to. However, it is 
convenient for our purposes to keep the grid steps arbitrary for now (as long as the primary and dual grid 
points alternate). 






/y.0 /y. 1 






X 


m 


- 0 - 


- 0 - 


- 0 - 


- 0 - 


- 0 - 


{) — X 


1 


X 


x-^ 


/yj Tl 


X 


-0 

m+1 


X 


max 




P 


Figure 6: In this figure, we sketch the staggered grid we use to construct finite-difference approximations of 
differential operators. The “primary” nodes {x^CC are indicated by the symbol o and the “dual” nodes 
{x^}jlo indicated by the symbol x. 


Recall that the operator A is defined by 

where Ux\x=o = 0 and 

\v ox / 


~ ^ d ( I du\ 

Au = -V— I - — 
ox 


— 0 - 


(A.I4) 


















Inversion via projection-based model reduction 


40 


Using centered differences, we discretize u on the primary nodes and Ux on the dual nodes to obtain m 

(A.15) 


Au{x^) ^ 

h-J Liv ox W ox 


yj 

hi 


( ui^^ -ui\ ( ui - ^ \1 

-==- - -==- for j = 2,..., TO, 

[\ IP hi I \vi-^hi-^ 


where pi = u{xi^ for j = 1, ..., to + 1, iP is an approximation to rf(aP) for j = 1, ... ,to + 1, and tP is an 
approximation to v{xi) for j = 0, ...,to. For example, if v is continuous, we may take zP = v(xi') and 
tP = v{xiy If V is not continuous, we may follow [S] and take 


1 

pi 


J{x) 


dx 


1 1 

hi Jxi-^ v(a 

(so pi is the harmonic mean of zi on and 

1 

zP s — / z;(a:) dS 

hi Jxj 

(so pi is the arithmetic mean of z; on (y, 

We discretize the Dirichlet boundary condition P\x^^^ = 0 by setting = 0. To handle the Neumann 
boundary condition at a; = 0, we introduce a “ghost node” at aP = -hP. Then, for j = 1, (A.15) is 

^1 


Azt(a;^) Ri 


P^-P^\ / zli-zz° \ 

Ph^ ) \ zP/i° / 

We discretize the Neumann boundary condition Ux\x=o = 0 by setting]^ 




= 0 . 


(A.16) 


In summary, we define u = [zt^,... ,zl™] € M™ (where we have implicitly taken zT"^^ = 0); then Au{xi) 

{Au)i for J = 1,... ,TO, where we define the following matrices in 


A = RS, ReVA, SeV^A, VEdiag (z;\...,zP"), V e diag (zl\, 
A E diag {l/h \..., l/h'")T, A e diag {l/h \..., l/h'")T'^, 




(A.17) 


and T is the m x m Toeplitz matrix with 1 on the main diagonal, -1 on the subdiagonal, and 0 elsewhere. 
Finally, A is self adjoint and positive definite with respect to the inner product 


a=i 


if f and g are viewed as primary-grid discretizations of functions / and Tj satisfying the boundary conditions 
in (A.14), then this discrete inner product is the midpoint-rule approximation of the inner product 

Here and throughout the remainder of this section, bold, lowercase Latin letters adorned with ~ or 
^ denote vectors in K™ that correspond to discretizations of functions on the primary grid or dual grid, 
respectively. In particular, the discretized versions of the primary and dual snapshots are denoted by 


_ ^-|(P rp 

Ufc = [Wfe, • ■ ■ ,wrj = [Uk{xi), ■ ■ .,Uk{x,n)] 

^For smooth v and uniform grid steps - h for j = 1,... ,m, - h/2, and hj = h for j = 2,... ,m, | |A.15| | is an 0{h^) 

approximation of A. An equivalent formulation arises by taking ^ = -h^/2 (instead of ^ = 0) and discretizing the Neumann 
boundary condition by du/dx{'^) + du/dx{x^) ss 0, which, in the uniform grid case, is an 0{h^) approximation to du/dx{0) = 0. 
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and 


r 1 1 ^ T 

\Vk=[Wf,,...,W^\ =[wk{xi),...,Wk{Xjn)] , 

respectively. Similarly, bold, uppercase Greek or Latin letters adorned with ~ or ^ denote m x m matrices 
that act on functions discretized on the primary and dual grids, respectively. For example, let us consider 
the matrix S = V A. The matrix A acts on the discretized snapshot to produce the vector Au^, 
which is an approximation of on the dual grid (because, as discussed above, we discretize |= on the dual 
grid). Since vector Auj, is a discretization of a function on the dual grid, it can be acted on by the matrix 
V . In summary, matrices with ~ (respectively, ~) act on vectors with ~ (respectively, ~); this notation 
allows us to retain information about the domains of the continuous differential operators in the discrete 
setting. 

We now focus on the discretization of the dual operator C: 


1 d (^dw\ 


where wy=o = 0 and Wx\x=x^^^ = 0. 


(A.18) 


For j = 0,... ,m, we denote = w{x^). Analogously to what we did before, we discretize w on the dual 
nodes and Wx on the primary nodes to arrive at 


Cwijx^) Ri -- 


1 

^■+1 

ox 

3d hi 


1 


- 

Id hi 

[ hAi 


, ^dw j 

ox 




(A.19) 


for J = 1,..., TO - 1. 


The Dirichlet boundary condition at a: = 0 is discretized by Wq = w(jrr) = w{0) = 0, while the Neumann 


boundary condition is discretized by introducing a ghost node 31” 

;^m+l _ 

'= 0 . 


= 31™ + and taking 


\ 7i™+i / 


Then Cw{x^) « (Cw)-^ for j = 1,... ,to, where 

CeSR. 

Note C is self adjoint and positive definite with respect to the inner product 


(A.20) 


i=i 


if f and g are viewed as dual-grid discretizations of functions / and 'g satisfying the boundary conditions in 
(A.181, then this discrete inner product is the midpoint-rule approximation of the inner product 
From (A.17) and (A.20), we find A and C are similar; in particular 

A=S~^CS and A = RCR^\ (A.21) 

(This is the only place in the proof where our notation does not work perfectly — in particular, S acts 

on dual-grid vectors while R acts on primary-grid vectors.) From this we obtain the following identities, 
which prove useful in forthcoming calculations: 


SA sm 


^0.5r\/Aj = sin^O. 


5rVC 


g-i/2g 


sin^ 


sin I 0.5tV a 


A ^^^R = RC sm 


in ^O.SrVEj . 


(A.22) 


^ Since all of the vectors we consider are in and all of the matrices are in we are allowed to intermix notations 

in matrix-vector multiplication, e.g., is well defined in a linear-algebraic sense; however, we are viewing the matrices and 

vectors as discretizations of differential operators and functions, respectively, on certain grids, so it is important to distinguish 
between those defined on the primary grid versus those defined on the dual grid. 
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We will prove the first of these identities — the second identity can be proved analogously. We have 


SA ^ sin ^O-Stn/a j = S ^ 
' ' i=n 


(r/2) 


2i + l 


(2j + l)! 

g - (r/2)^ 

h (2J + 1)! 


(r/2)^ 


(2j + 1) 


S ^c^'s 

2j+l 






sin ^ 


= sin I O.SrV C 1 C ^ S. 


Next, we define the matrices 
- 1/2 


sin ^ 


-T 2~-i 


At-E-AA sinlO.STvA and A,, e-V sinlO.STvA A VA; 


in ^O.f 


- 1 / 2 - 


(A.23) 


At- and At- are discrete approximations of Lt- and Lr , respectively. We consider the following discrete 
approximation to (3.20): 


W/c-Wfe_i ~ r ; n o 1 

= V At-u^ for fc = 0,..., 2n - 1, 


T 

Ufc+1 - Ufc 


= -VAt- Wfc for fc = 0,..., 2n - 2, 


(A.24) 


Uq = b, Wo + w_i = 0. 


Applying - VA,- to the first equation in (A.241 and simplifying the result via the second equation in (A.24) 
gives 


Ufe+i - 2Ufc + Ufe_i , n o o 

---= -VA,.V A,-Ufc for fc = 0,..., 2n - 2. 


(A.25) 


The initial conditions for this iteration are Uq = b and Ui = u_i since, by the second equation in ( |A.24 ) 
(applied for fc = 0 and k = -1), 

Ui-U_i Ui-Uo Uo-ll-i \ n 

-=-+-= -VA,. (wo + w_i) = 0. 


The operator on the right-hand side of (A.25) satisfies 


1 




vLr-Lr ^-VA^Y At- 


■ sm 


^0.5T\/Aj 

( 


I - cos I rV A 


A ^ RS A ^ sinfo.STVAl 


=A 


= ?(P). 


where P e cos^rv Aj. This, in combination with (A.25), implies satisfies the recursion 

^ ^ for A: = 0, ...,2n-2, Uo = b, ui = u_i, 

which is a discrete approximation of (3.17). Note in the continuum limit we have -vLt^Lt'^ = ^ (P). 


We now apply the operator V At- to the second equation in (A.24) and simplify using the first equation 

Wfc+i - 2Wfc -I-Wfc_i ^ r , n o o 

---=-V ArYA^Wk for A: = 0,..., 2n - 2. 


in (A.24) to find 


(A.26) 
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The initial conditions for this recursion are Wq + Wi = 0 and (taking fc = 0 in the first equation in (A.24) and 
using (A.22)) 


r ~-i- 


- 1/2 


-l/2j 


Wo=-V A^b=SA ' sin|^0.5TVAjb = sin|^0.5rVCjC Sb. 

This is a discrete approximation to wq = sin(o.5r\/c') — Moreover, by (A.22) we have 

\ / V ox - 


-—LtvL^ 

V 


^ « -v"^a,va!’ 


- 1/2 


4 


dn 


dn ^0.5rV^j sin ^Oi 


- 1 / 2 - 


= —sin I 0.5rV A 1 sin I 0.5rV A 1 A R 


-i/A 


- 1/2 


= —^sinlO.STVClC ' SRC ' sin 


in ^0.5rV^ j 


=c 


= ?(Pc). 


where Pc = cos^ry/cj- Then ( A.26[ ) implies Wk satisfies the recursion 

Wfc+i - 2wfe + Wfc_i fin o o 

- — -=C(PcjWfc for A: = 0, ...,2n-2, 

/ /;:r\ —-i/2- 

Wq -1 w_i = 0, Wq = sin I O.StV C C Sb, 


which is a discrete approximation of (3.19). Again, in the continuum limit, we have -^LpuLr'^ = ^ {Pc)- 
Finally, we must prove that is indeed the adjoint of Lr with respect to the inner product (•, •) ■ 

Let /,(3 € ^^[OjXniax] such that Lrf, Lx'^'g e L^[0,afinax] with / satisfying the boundary conditions in 
( |A.14[ ) and g satisfying the boundary conditions in (A.18). Also, let f e [/(o;^), ...,/(af")] and g e 
[ g(3r^),... We define the inner products 

m _ m _ 

= E and (f, 1)7^ = E ■ 


1=1 


1=1 


Then, using (A.17), (A.23), and the fact that functions of A are self adjoint with respect to we 

obtain 


{Lrf,V) 


L'^[0,Xn 


(A4,g>77 


2 / 
r 
2 


- 1/2 


sin^O.f 

dn^O.f 


= | —AA ^ sin ( 0.5 tV A ) f, g 


= -^TA sm(0.5rVA)f,g 


- 1/2 


= - ^A sin 
.- 1/2 


^0.5T\/Ajf, 

i^O.f 


rr\T-^ 

T g 


(2 (S’") 
P{R^) 


= ^(A ^''sin(0.5TV'A)f,Vdiag (l/F,..., l/h'")T'^g 


h/v 


A ^^^sin^O.Sr'yAj f,Rg|_ 
|f, sin ^ 


IO.StVAIA ^^^Rg 


2~-i 


.( 0 . 


1 / 2 - 


= f,-V sin 0.5rVA A ' Rg 


ilLr^g) 


L2[o,3 
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A.7 Proof of Proposition |3.9| 


First, we use Algorithm |3.1| to show that ui and U 2 are orthogonal. We have 

{u2,Ul)i/y = {ui -JjVLr’^Wl,Ul)^ 


' Ijv 

-1 


L'^[0,Xn 


= 7r 

= 7l ^ - 7i , = ArMl 

= 7-1 - 7 ^ 7-1 {wi,wi)^ 

= 7r^-7r^ 

= 0. 

Similarly, {w 2 ,wi)^= 0. 

Now, suppose for induction that, via Algorithm 3.1 we have constructed ui,... ,Uj such that {uj,Uk)ii-jj = 
0 for k = 1 and wi,...,Wj such that {wj,Wk)^ = 0 for k = - 1. Our goal is to show 

(rtj+i, = 0 for k = 1,... ,j. Proceeding as in the previous paragraph, we find 

{Uj + l,Uk)y^ = {uj - l3vLjwj,Uk)y- 

= (Mj,Mfc)i/ir-7j (wj, 

\ V IV 

= {Uj , Mfc {Wj , Wk - Wk-l)^ ■ 

By the induction hypothesis, the last expression above is zero for k = 1,..., j - 1, while for fc = j it is equal to 

-if = 0- 

A similar argument shows {wj+i,Wk)-j; = 0 for /c = 1,..., j. 

Finally, the equalities 

span{Mi,...,M„} = /C“(?Io,P) and spanjwi,... ,iZ;„} =({cq, Pc) 

are corollaries of Lemmas |3.10 and |3.11 respectively, in combination with the fact that the Lanczos algorithm 
generates an orthonormal basis for the Krylov subspace /C„(6, P), where b is the starting vector and B is 
the operator in question [4(1] . 


A.8 Proofs of Lemmas 13.101 and 13.111 

From Algorithm |3 .1 1 we have 

Uj + l = Uj - ^jVhfwj 

= Uj - JjVLr'^ ^j-1 + Ij-LrUj j 

= Uj - JjVLT'^Wj_i - 'YjJjVL-r'^-LrUj 
= Uj + 7 j 7 -_\ (uj - Uj-i) + 7 j 7 jC (p) Uj, 


(A.27) 


where the last equality follows from (3.211. 

We define dj = Ujj \uj = ff^Uj. Then ( |A.27| ) becomes 


1 ~A ^3 + 1 = 7 ; ^ (1 + l3l~3-l)^3 - l3lfllfi ^3-1 + I 3 I 3 ^ (P) ^3- 


^-1/2 
7+1 

This can be rearranged as 

where a“ a nd b'j are defined as in ( |3.30| ). Because the functions dj (j = 1,..., n) form an orthonormal set by 
Proposition 3.9 this is exactly the Lanczos three-term recurrence relation |40j . 


Lemma |3.11| may be proved similarly. 
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A.9 Proof of Lemma 13.121 

We use induction to prove this lemma for the primary orthogonalized snapshots, Uj. For the base case, we 
dehne qi{x) = 1; then ui = g” {P)) Qi is a polynomial of degree 0, and (/“(O) = 1. 

Next, let j > 2. Suppose for induction that Uk = {P)) for fc = 1,..., j, where q'^ is a polynomial of 

degree fc - 1 such that q^{0) = 1. Then Algorithm |3.1| and the induction hypothesis give (see (A.27)) 


Uj+i = (l + 7i7i-i) - 7j7j_\uj_i + 7j7jC (P) u, = gj+i {P))uu 


where 


qj+i{x) = (1 + 7j7j-i) (a;) - + Jjjjxqj(x) 

is a polynomial of degree j (since 7 ^, + 0). Moreover, by the induction hypothesis we have 

9“+i(0) = (1 + 7j7A) “ = 1- 

The proof for the dual orthogonalized snapshots is similar. 


A.10 Proof of Remark 13.131 

For simplicity, we will work in spatial coo rdina tes instead of in slowness coordinates for this proof. We define 


= 7 y^ 7 i then, thanks to Lemma 


3 '3 

Lemma 


3.10 


3.12 


we have dj = (-P)) ^i- 


and the statement of Remark 


3.13 


respectively (here P?=P?(C(P)) and s q^Ax)): 


imply that dj and satisfy the following recursions, 


1. a“ = 

2 . r = 
3.6^ 
4. dj+-_ 

end for 


0 and 'di = c =p^'di. 

Set qg = 0 and qf = 1. 

1, ..., n do 

for j = 1,..., n do 


1. a“ = (qf,xqf)^^^; 


2. r = [(x - a'l)ql - /3“ ilti] Qi 

to 

3- Pi =\/{r,r)^ g; 

1 = 

4 q« = — 


(A.28) 


end for 


Because p\ = 1 and q^ s 1 , the above recursions imply p^ = q^ if a)* = {j = l,...,n) and 6 “ = /3“ 
{j = 1,..., n - 1). Before proving this, we note the above recursions imply p| and q| are polynomials of 
degree j - 1 . 


Mimicking the derivation of (3.3) in § A.l we find 


{f{^{P))^i,9{^{P))^i)i/v^ = ° Oid){9 ° dp. 


(A.29) 


If fiO q(^) are both polynomials of degree less than or equal to n - 1, then (/ o ^)(p) and {g o ^)(p) 
are both polynomials of degree less than or equal to n - 1 with respect to the independent variable p (since 
^{p) = -^{1-p) is linear in p); thus [{f °0{9°0]{d) is a polynomial of degree less than or equal to 2 n- 2 , 
so the Gaussian quadrature from § 4.2 computes the integral in (A.29) exactly. In particular, this implies 


that the inner products in the recursion on the left-hand side of (A.28) may be replaced by the Gaussian 
quadrature rule, i.e., 

a- = {pliOdiApliOdi) = 


3 = 1 


and similarly for 6 “. Because both recursions in ( |A.28 ) have the same initialization, a standard induction 
argument shows a“ = a“ for j = 1,..., n and 6 “ = /?“ for j = 1,..., n - 1. As stated above, this implies p| = q| 
for q = 1 ,..., n. 
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A.11 Proof of Proposition [KT] 

We will prove the proposition for the dual snapshots; the proof for the primary snapshots is similar. 

The pr oof is by induct ion. If j = 1, then, according to (5.21, = {cq; on the other hand, thanks to 


Algorithm 


3.1 


and Lemma 


3.7 


Wi ='^i=LrUo = ^Wo = d^w' 


Next, suppose for induction that Wi = for i = 1,..., j - 1 and define 






Then Wj and 


are in spanjsi,..., Sj-i,Wj-i}, so 


_ _Qg \ _ 


t-i 


(A.30) 


for some coefficients pi. We take the inner product of both sides of the above equation with for k = 
1,... ,j - 1 and use the fact that {wj,Si)~ = = 0 for z = 1,..., j - 1 to find 

0 = (wj - Ujf ®, Sk) = Pk+ Pj (fCj-l, Sfc)- . 


Substituting this into (A.30) gives 


j-i 


'^3 = - E Pi {Wj-uS^)~Si + PjWj-i = (1 + Pj)wJ^. 

i=l 

Next, by (5.2), Lemma ( |A.31[ ), and Lemma 3.12[ we have 

ajf = Qf (e (Pc)) wi = (l + P,)-% = (1 + P.r^qf (? (Pc)) 

where 

(C (Pc)) ^ ^ (Pc) + TjH (Pc)] - E (C (Pc)) 


(A.31) 

(A.32) 

(A.33) 


and, by the induction hypothesis. 


Wi 

Cij - IWi-i, — 


Recall ^ (Pc ) = 0 if and only if Pc = I. Then (A.33), standard results about Chebyshev polynomials, 
Lemma[3.12[ and (A.321 imply 


QJiO) = ^(2j - 1) - E Q, (1 E 7.] = (1 + Pr 

271 i=i \ 7 i k=i / 


r<(o) = (i+p,r - 


\7i*=i / 


The conclusion of the proposition follows by taking dj = 1 + pj. 

A.12 Proof of Lemma 14.21 

To show that Uk = UTk(H.)ei, it suffices to demonstrate 

Tfc(H)ei=efc+i for fc = 0,...,n - 1, 


(A.34) 


because Uk = Uek+i- 

We prove (A.34) by induction. Since we use the Chebyshev three-term recursion formula 


rfe+i(H) = 2Hrfc(H)-rfe_i(H), 


(A. 35 ) 


the base of induction consists of the two cases k = 0, 1. 
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The case fc = 0 is trivial: 

ro(H)ei = lei = ei. 


For the case A: = 1 we observe from (3.8) that ui = cos(t^/A)uo = Puq, so 


Ti(H)ei = Hei = {U*U)-^U* PUe-y = {U*U)-^U*Puo 
= {U*U)-^U*ui = {U*U)-^U*Ue2 = 62 . 


For the induction step we use the trigonometric identity 


Puk = cos cos ^krVA^ uq 


j^COS + cos - l)TN/]4jj Uo 


(A.36) 


- 2 “fc-i) 


where the first and last equalities follow from (3.8). Then the induction hypotheses are rfc(H)ei = e^+i and 
T/j_i(H)ei = Gfe, which in conjunction with (A.35)-(A.36) imply, for k = 0,... ,n- 2, that 


n^i(H)e, = 2 HTk(H)e, - T,_,(H)e, 

— 2Hefc+i — 

= 2(U*U)-^U*PUek^i-ek 
= 2(U*U)-^U*Puk-ek 
= {U*U)-^U*{uk^i+Uk-i)-ek 
= {U*U)-^U*{Uek^2 + Uek)-ek 
= (efc+2 + e/c) - e/j = ek+2- 


For fc = 0,...,n- 1, the formula for fk is an immediate consequence of (3.9), the fact that uq = Uei, and 
the first part of this lemma: 


fk = uluk = (17ei)*C/rfe(H)ei = ef (t7*C/)T,.(H)ei. 


The proof that (4.3) holds for fc = n,..., 2n - 1 is more subtle. First, we define the operator 

H = u{u*uy^u*p, 

so C/H = HU. In fact, if gf is a polynomial, we have Ug{Il) = g{H)U. Moreover, the operator H is self 
adjoint with respect to the inner product ((•,•)). 

Next, we note that 

T^^yx) = T^ll{x)T^.yx) - Tf{x)Tr,_2{x) (A.37) 

for all j > 0, where is the j*** Chebyshev polynomial of the second kind (the identity (A.37) can be 
proved by induction on j). Then 


e?^17*t7r„+,(H)ei = ((f/ei, (H)ei)) 

= iUe, 7 ^n+j {H)Ue,)) 

= (( 1761 , - Tj^\H)Tr,. 2 {H)] Ue,)) 

= ((T}^"j(i?)17ei,T„_i(i?)C/ei)) - ((rf \i?)C/ei,r„_2(i7)C/ei)) 
= ((l7Tj^'J(H)ei,t/r„_i(H)ei)) - ((c/Tf ^(Hjei, t/r„_2(H)ei)). 


Using the identities (A.12) and the fact that Tfc(H)ei = e^+i for fc = 0,..., n - 1, we find 


C/rf^(H)ei =Tf^(P)Uei 


(A.38) 


















Inversion via projection-based model reduction 


48 


for j = 0,... ,n - 1. Using this, the fact that P is self adjoint with respect to ((•,•)}, and (A.371 in (A.381 gives 


efC/*C/T„^,(H)ei = C/e„)) - ((rf ^ (P)C/ei, C/e„_i)) 

= ((c/ei,Tf^(P)u„_2)) 

= ((uo, - 7j2)(P)P„_2(P)] uo)) 

= UQTn+j{P)uo 
= UQUn+j 
= fn+j 


for j = 0,..., n - 1. 


A.13 Proof of Lemma 14.31 


Since rjo satisfies the hypothesis of Lemma 4.1 P is of full rank; thus U*U e 
definite matrix. Let x, z 6 K". Then, since U*PU € is symmetric, we have 


is a symmetric, positive- 


= ((C/*PP)x,z),.(R„) 

= (X, (U PP)Z);2(R,X) 

= (x,(P*P)-i(P*PP)z>^.^ 
= (x, Hz)^,y; 


thus H is self adjoint with respect to (■,-)ij*u- 
Next, we symmetrize H by defining 


H E {u*uy/‘^u{u*uy^/^ = {u*uy^/'^{u* pujiwu)-^/"^ = 

Because H is symmetric, it can be orthogonally diagonalized as 


H= $0$ 


where $ ^ = Ir, 


(A.39) 


(A.40) 


and 0 is a diagonal matrix of the eigenvalues of H (which are the same as those of H since H and H are 
similar). If we define 4> e then (A.39) and ( |A.40 ) imply 

H = $0$^(P*P), where = I„x„. 


A.14 Proof of Lemma 14.41 


In order to compute U*PU and U*U we will need the inner products of the snapshots. Using (3.8), the fact 
that A (and functions of A) are self adjoint with respect to ((•, •)), and the fact that functions of A commute, 
we find, for j. A: = 0 ,..., n - 1, that 


{{uj,Uk} = ^^u(0) cos (;(A)^'^^(5, ^(O) cos 

= |<5, cos j cos ^fcrN/Aj q{A)6'^. 

Applying the trigonometric identity 

cos cos ^/ctn/A^ = ^ [^cos ^(j + k)T\/A^ + cos (a - fc)T\/Ajj 


(A.41) 
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to (A.41) we obtain 


{{uj,Uk)) = ^ [|(5,cos(0' + fc)r\/A](3'(A)<5| + |5,cos((j -/c)tn/a) g(A)i5^] 


where the snapshots with negative indices are defined using the evenness of cosine, i.e., we take ui{x) = U-i{x) 
for I < 0. Thus 

([ujjUk]) = —{fj+k + fj-k)- (A.42) 

Let us consider U*PU first. Applying the formula in (|A.36) to PU, we get 


U*PU = -U* {[U-i,Uo . . . ,M„-2 ] + [ui,M2 ■ • ■,««]) • 


(A.43) 


Using the inner product formula (A.42), the first product on the right-hand side of (A.43) becomes 


U*[U-i,Uo ■ ■ .,Un-2] = 


■ fl + /-I 

fo + fo 

f-1 + fl 

f-2 + f2 

f-n+2 fn-2 

/2 + /o 

fl + fl 

fo + /2 

f -1 + /a 

f-n+3 fn-1 

/a + fl 

/2 + /2 

fl + /a 

fo + /4 

f-n+4 fn 

,fn fn—2 

fn—1 fn-1 

fn-2 fn 

fn-3 fn+1 

fl + /2n-3 . 


Similarly, for the second product in (A.43) we have 


U*[U1,U2 ... ,Un] = 


f-1 + fl 

f-2 + f2 

f- 

+ /a 

f-i + fi 

f-n + fn 

fo + /2 

f -1 + /a 

f- 

+ fi 

f-3 + /s 

f-n+1 fn+1 

fl + /a 

fo + fi 

f- 

+ /s 

f-2 + fo 

f-n+2 fn+2 

./n-2 fn 

fn-3 ■*" fn+1 

fn-A 

+ fn+2 

fn-5 ■*" fn+3 

f-1 + f2n-l _ 


The same inner product formula applied to U*U yields 


U*U = 


fo + fo 

f-1 + fl 

f-2 + f2 

f-3 + /a 

f-n+1 fn-1 

fl + fl 

fo + f2 

f-1 + /a 

f-2 + fi 

f-n+2 fn 

f2 + f2 

fl + /a 

fo + fi 

> 

+ 

f-n+3 fn+1 

1 

1 

fn-2 fn 

fn-3 ■*" fn+1 

/n-4 ■*" fn+2 

fo + f2n-2 . 


(A.44) 


(A.45) 


(A.46) 


Finally, using the evenness of the cosine (i.e., fi = f-i for I <0), we observe that each of (A.44)-(A.46) 
can be expressed as a sum of a Toeplitz matrix and a Hankel matrix: 


U* [u_i, uq, ..., Un- 2 ] - 2 + H ) , 

U* [ui,U2 ,...,u„] = i(T- + H+), 
U*U= i(T° + H°). 
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A.15 Derivation of Algorithm |4.2| 

Let (-Ai,r;), I = 1,... ,n, be an eigenpair of M, i.e., 

Mr; + XiTi = 0. 


(A.47) 


Since M is similar to ^ (Pn), Lemma 4.1 implies -A; = C{0i) e O]; Lemma 4.1 also implies the eigenvalues 
Xi are distinct. 

We introduce the auxiliary variables 


sij = -- and sij = -- tor / = 1,..., n, j = 


^/V7^ 




Let 


gi and g; = S [ri,i,'si^i,... ■, 


(A.48) 

(A.49) 


where <5 is a constant we will determine later. We also define 

^0 1 
1 0 -1 
-1 0 1 

O E 10-1 


-1 0 1 
1 0 


= and TEOr'\ 


where P is defined in (3.28). Then, in combination with (A.48), (A.47) may be written in first-order form as 

(A.50) 

where 


LQ — QT, 

L = diag (yVi \/^: “\/^i • • ■ I “'yV) j 


and 




T 

Si 

—T 

Si 


Sr? 


Ml ■■■ ^r. 


T)2nx2n 


(A.51) 


(A.52) 


Note that (A.50) is an eigendecomposition of = P ^O, i.e., = Q“' L. This may be written in a 

different basis as 


T' P^/^Q^ = P^/^Q^L, where T' e P“"/^ = P~"/^OP 


_ T^l/2rriX''r\—1/2 t^ —1/2^ 


- 1/2 


(A.53) 


Since T is symmetric and we are assuming all eigenvectors of symmetric matrices are normalized with 
Euclidean norm 1, we have 

^2nx2n ~ (pi/2Q^)’^pi/2Q^ = QPQ'^; 


this implies 


Q^Q = P'\ 


(A.54) 


The algorithm is essentially given by (A.50) and (A.54); all that remains is for us to initialize the algorithm 

(A.55) 


appropriately, i.e., we need to compute 

Ml = <5[?'i.i,fi,i,r2,i,r2,i,... 


We begin by determining the constant <5 from ( |A.49[ ). First, by ( |A.52[ )-( |A.53[ ), 


L2(B2") 


1/2 isl/2 


= 50 P ' r,,D r. 


‘) 






= 25^ 


(A.56) 
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- 1/2 


where D e diag (71,..., 7„). The last inequality above holds because D ' r; is an eigenvector of the symmet¬ 
ric matrix M; similarly, by eliminating from the recursion (4.28), it can be shown that is 


an eigenpair of the symmetric matrix N e , where -M = R“^ R with R upper triangular is the Cholesky 

decomposition of -M. (The previous analysis also holds with g; replaced by g;.) 

-1/2 


Next, D r; and X/ are normalized eigenvectors of M = ^ (Pn) by (A.47) and (4.14), respectively. Thus 


(4.15), the fact that the eigenvalues -A; are distinct (by Lemma 4.1), and (4.29) imply 


j/f/c= (efX/)^ = 




n,i = yi- 


(A.57) 


(A.50) and (3.26). 


Then (A.50)-(A.52) and (A.54)~(A.57) give us the algorithm for computing 7^- and 7^ , which we summarize 
in Algorithm 4.2 Algorithm |4.2|is isomorphic to Algorithm |3.1| this is due to the close relationship between 
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